While technology-assisted review (TAR)—also known as computer-assisted review or predictive coding—has been gaining traction in the legal space, the discussions have focused on the reliability or defensibility of TAR tools and workflows. Today, there is a growing need for practical, pragmatic advice for those considering trying TAR.
In a webinar on the topic hosted by Relativity Best in Service Partner Modus, I addressed the practical side of using TAR and gave advice on commonly overlooked considerations and misunderstood steps that can trip up those new to it. With TAR, it’s important to plan the right process and set the right expectations for the team. Below are some takeaways.
Integrating the Right Team for the Job
For a TAR project to be successful, it is important to understand that the process needs to be supported by an integrated team of the right professionals and the right technology.
First is the case’s subject matter expert (SME)—someone intimately familiar with the issues of the case and able to facilitate consistent coding decisions on documents.
Second is someone who understands how the chosen technology works, can communicate with the case SME, and can analyze the results of the process throughout the case, making recommendations along the way. It is important that this technology support professional communicates often with the case SME to ensure not only that documents are coded correctly, but that the case SME understands what makes a particular document a good or bad example for training the system. Not all documents that are important to the case are good for training purposes.
As the case team culls large data sets or prioritizes documents for further review (two common approaches to using TAR), the technology support professional should track the process against the chosen milestones and methodology, validating the results through statistical sampling and a reliable metric such as overturn percentage or precision and recall. In many cases, this information will be essential to demonstrating the defensibility of your process.
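To make those metrics concrete, here is a minimal Python sketch of how they could be computed from a QC sample. It assumes the textbook definitions (overturn rate as the share of sampled documents where the reviewer disagreed with the machine, precision and recall computed on the "responsive" category) and a sample drawn at random across both machine categories. The `SampledDoc` type and `validation_metrics` function are hypothetical names for illustration; your review platform may define or report these figures differently.

```python
from dataclasses import dataclass


@dataclass
class SampledDoc:
    doc_id: str
    machine_responsive: bool  # category assigned by the TAR system
    human_responsive: bool    # category assigned by the QC reviewer


def validation_metrics(sample):
    """Return overturn rate, precision, and recall for a QC sample."""
    # Machine said responsive, reviewer agreed.
    tp = sum(d.machine_responsive and d.human_responsive for d in sample)
    # Machine said responsive, reviewer overturned it.
    fp = sum(d.machine_responsive and not d.human_responsive for d in sample)
    # Machine said not responsive, reviewer overturned it.
    fn = sum(not d.machine_responsive and d.human_responsive for d in sample)

    overturned = fp + fn
    return {
        "overturn_rate": overturned / len(sample) if sample else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

Recording these figures after every round gives the team a simple trend line to watch and an audit trail to point to if the process is ever questioned.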
Setting the Right Expectations
Once you have gathered the right team for a TAR project, the next challenge is setting the right expectations. Sit down with all stakeholders, including in-house and outside counsel, relevant paralegals and project support staff, litigation support, and technical support staff, so that everyone shares the same understanding of how the TAR project will progress.
For example, everyone must know that TAR is an iterative process requiring more than just one round of human coding, machine classifying, and QC. The minimum number of rounds I’ve experienced was six, and the maximum was almost 20. No single number applies to every project, so make sure the whole team knows that the round count will vary.
It also must be understood that the human expert’s review speed often will not be as fast as the speed of reviewers in a typical large-scale document review. This is both because the case SME training the system may be seeing document selections without any document organization, and because they must consider both substantive coding and the suitability of each document as a training example.
Some Common TAR Myths
In tandem with setting the right expectations, it is important to note that there are a lot of common myths and misconceptions about TAR.
There is a misconception in our industry that defending this process will be difficult. In fact, if TAR is used for internal reasons and you still intend to have eyes on everything, you will not always have to defend your TAR process. When you do have to defend your process, depending on the protocol, it’s often no different from defending other discovery processes. Audit data and accuracy metrics can be obtained as easily with a TAR workflow as with a manual one, if not more easily, whether you track your process on your own or use the reporting tools built into software like Relativity Assisted Review. Use these metrics to defend your process if necessary.
You can also support its defensibility with research that supplements your own results: numerous articles, studies, and rulings can help you demonstrate the reasonableness, value, and reliability of the TAR process in general.
Another common misconception is the notion that TAR can work from any collected data. In fact, TAR excels at analyzing and classifying materials composed of unstructured text like that found in collections of email, but it is of no help in reviewing other kinds of materials, such as number-intensive spreadsheets or media files. Other materials may be better off manually reviewed because of the nature of the content (e.g., technical documentation), the prevalence of atypical language (e.g., chat logs), or the importance of the materials (e.g., hot documents). Find out whether these kinds of materials are included in your case, and prepare to address them in another way if necessary, so your TAR process can focus on conceptually substantive documents.
Finally, a common myth is that sampling randomly from the entire data set is always the best way to find examples for training the system. In fact, intelligently sampling from narrower subsets can be more efficient. For example, Relativity’s stratified sampling option can help improve results and decrease rounds, as the sample documents used for training are pulled from concept groupings that represent the spectrum of subject matter in the data set. With options like these, the more you understand your data, the less randomness is actually needed.
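The difference between the two sampling approaches can be sketched in a few lines of Python. This is an illustration only, not how Relativity implements its stratified sampling option: the function names are hypothetical, the concept groupings are assumed to be available as a mapping from a group name to its document IDs, and the equal per-group allocation is a simplifying assumption.

```python
import random


def simple_random_sample(doc_ids, n, seed=0):
    """Draw n documents uniformly from the whole data set."""
    rng = random.Random(seed)
    return rng.sample(doc_ids, min(n, len(doc_ids)))


def stratified_sample(clusters, per_cluster, seed=0):
    """Draw up to per_cluster documents from every concept grouping.

    clusters maps a (hypothetical) grouping name to a list of document IDs.
    """
    rng = random.Random(seed)
    picked = []
    for name in sorted(clusters):  # sorted for a reproducible order
        docs = clusters[name]
        # Small groupings contribute everything they have.
        k = min(per_cluster, len(docs))
        picked.extend(rng.sample(docs, k))
    return picked


# Hypothetical data set: one dominant topic and two small ones.
clusters = {
    "pricing": [f"pricing-{i}" for i in range(1000)],
    "contracts": [f"contracts-{i}" for i in range(50)],
    "hr": [f"hr-{i}" for i in range(5)],
}
training_set = stratified_sample(clusters, per_cluster=3)
```

With a simple random draw of nine documents from this hypothetical set, the five "hr" documents would rarely appear at all. The stratified draw guarantees that each grouping contributes examples, which is the intuition behind needing fewer rounds when samples span the subject matter.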