
Is Predictive Coding Defensible?

Greg Houston

With increasing amounts of data in discovery, the potential of computer-assisted review has been on the minds of legal professionals for a number of years. Still, many in the legal community have taken a wait-and-see approach to employing predictive coding. While adoption of the technology is growing, one question remains: Is it defensible?

To answer this question, industry experts have drilled into the statistics behind computer-assisted review, while several noteworthy court cases have added fuel to the discussion. We see these as the two biggest signs computer-assisted review is moving to the mainstream.

Testing the Statistical Soundness of Predictive Coding

Reasonably, the legal community has looked to statistics when evaluating predictive coding. Several computer-assisted review white papers have included a statistical analysis of the workflow and technology. Here are two, which can be accessed from this page:

  1. Dr. Gideon Frieder’s white paper, Common Statistical Concepts and Their Influence on Computer-assisted Review, provides an overview of the basic statistics behind predictive coding. Based on Dr. Frieder’s paper, we understand that:
    • Accurate results require an adequate sample size. Meeting that requirement is what makes a project statistically valid and, in turn, defensible.
    • Unless every document is manually reviewed—which could mean sifting through millions of documents by hand—a small amount of error is always possible.
    • An expert team can review an appropriately sized, carefully selected set of sample documents, keeping the project proportional while remaining confident in the results.
  2. Additionally, when conducting a study on the validity of Relativity Assisted Review, Dr. David Grossman found that the statistical sample of documents accurately reflected the full population. In his study, titled Measuring and Validating the Effectiveness of Relativity Assisted Review, he found that:
    • With each round of computer-assisted review, document categorization improved.
    • Results improved noticeably after just three rounds of responsive and non-responsive review.
    • Based on the fact that his results adhered to a narrow margin of error, he concluded that the samples used were representative.
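
To make the sample-size and margin-of-error points above more concrete, here is a minimal sketch of the standard textbook calculation for estimating a proportion from a random sample. It is not taken from either white paper, and it is not how any particular review product computes its samples. The function names, the 95% confidence level, and the 2.5% margin of error are illustrative assumptions.

```python
import math
from statistics import NormalDist


def sample_size(population, margin=0.025, confidence=0.95, p=0.5):
    """Documents to sample so an estimated proportion (e.g. the share of
    responsive documents) falls within +/- margin at the given confidence.

    p=0.5 is the most conservative guess when the true rate is unknown.
    All defaults here are illustrative assumptions, not recommendations.
    """
    # z-score for a two-sided interval, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction: smaller collections need fewer samples
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


def observed_margin(hits, sampled, confidence=0.95):
    """Approximate margin of error for a proportion measured on a sample
    (normal approximation; it ignores the finite population correction)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = hits / sampled
    return z * math.sqrt(p_hat * (1 - p_hat) / sampled)


if __name__ == "__main__":
    for docs in (10_000, 1_000_000, 10_000_000):
        print(f"{docs:>10,} documents -> sample of {sample_size(docs):,}")
    print(f"margin for 750 of 1,500: {observed_margin(750, 1500):.4f}")
```

Running the sketch shows why sampling keeps a project proportional: under these assumptions, the required sample grows very little as the collection grows from ten thousand to ten million documents.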

Taken together, these analyses support the view that computer-assisted review, when backed by adequate sampling and validation, can be as effective as manual review methods.

Coding Comes Up in Courtroom Conversations

With statistics on their side, law firms and litigation support teams are becoming more vocal in favoring the use of computer-assisted review, but they aren’t the only ones. More judges are seeing the validity and value in the practice because it offers both cost and time savings to litigants.

Since Judge Andrew Peck allowed the use of predictive coding in the landmark 2012 Da Silva Moore case, the technology has become a point of conversation in many others. And with a number of judges limiting review hours, including Judge Paul W. Grimm and Judge Lorna Schofield, it is becoming a necessity in some cases.

Recently, the US Tax Court approved the use of computer-assisted review in a case between Dynamo Holdings and the IRS. When the IRS asked for access to complete backup tapes during discovery, Dynamo Holdings asked to use predictive coding technology to protect privileged data and reduce review costs. Judge Ronald L. Buch compared computer-assisted review with traditional manual review methods and found it to be a reasonable compromise. He also pointed out that the respondent could file a motion to compel if it believed the production to be incomplete, just as it could after a manual review.

These are just a handful of examples of predictive coding’s increasing presence in the courtroom.

Looking Forward

Since making its first waves in the e-discovery industry just a few years ago, computer-assisted review has come a long way. The technology, which other industries have relied on for decades, is having a lasting impact on the way legal professionals handle Big Data during litigation and investigation. We’re excited to see it help more and more case teams solve even bigger challenges as data volumes continue to grow. If your team could use some guidance setting up a predictive coding workflow, whether it’s your first project or your tenth, let us know. We’re here to help.

Greg Houston is a member of the Relativity customer success team, providing guidance on customized e-discovery workflows that fit the unique needs of every case team. Greg previously served as litigation support project manager at various Chicago law firms and has 12 years of experience managing small and large cases from collection to trial.