
Making a Case for Machine Learning to Legal Departments

Dean Gonsowski

This article originally appeared on the Fast Forward Labs blog.

E-discovery software is not used exclusively by law firms or service providers. Today’s corporate counsel face constant pressure to reduce litigation costs, especially when confronted with numerous discovery requests. That pressure is intensifying as data volumes grow exponentially and content proliferates. IDC’s Digital Universe study predicts the world’s data will amount to 44 zettabytes by 2020, which will have a significant impact on the compliance and e-discovery needs of organizations on a global scale. Corporate legal departments are aware of the trend, but a frequently cited challenge is understanding what tools are available to combat this data deluge.

Tackling Corporate Data Challenges

Several years ago the RAND Institute for Civil Justice conducted a study of litigant expenditures in e-discovery. It found that corporations spend just eight percent of their discovery costs on collecting the data, 19 percent on processing and normalizing the data, and a whopping 73 percent on review, which makes review the main cost driver of e-discovery. By applying existing technologies, especially text analytics, to the review phase, corporations can start driving greater savings across the entire e-discovery spectrum.

The RAND report suggests that the increasing volume of digital records makes techniques that leverage machine learning the most cost-effective options for conducting review. Predictive coding, or technology-assisted review (TAR), for instance, harnesses supervised machine learning to predict the responsiveness of documents based on prior coding decisions. In our experience, most savvy practitioners see this type of machine learning as an obvious way to increase efficiency in the review process, but a number of factors have limited adoption. As a result, there is a conspicuous “consumption gap” in legal technology: the gap between how the technology is currently used and what it is capable of. A 2015 PC-TAR Focus Report prepared by the eDJ Group noted that counsel and management often resist analytics technologies because of their limited knowledge of the software’s capabilities, limitations, potential costs, and applications. In other cases, the resistance simply takes the form of the refrain that the old ways are good enough.
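For data scientists who want something concrete to show a legal team, the idea behind predictive coding can be sketched in a few lines. The example below is a minimal illustration, not how any particular e-discovery product works: it assumes a small multinomial Naive Bayes classifier as the supervised model, and the function names (train, responsiveness_score, rank_for_review) and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(coded_docs):
    """Fit a multinomial Naive Bayes model on (text, is_responsive) pairs.

    Assumes the seed set contains at least one responsive and one
    non-responsive document.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in coded_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def responsiveness_score(model, text):
    """Return the log-odds that a document is responsive (higher = more likely)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]  # ignore unseen words
    log_odds = 0.0
    for label, sign in ((True, 1), (False, -1)):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for tok in tokens:
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        log_odds += sign * score
    return log_odds


def rank_for_review(model, uncoded_docs):
    """Order uncoded documents from most to least likely responsive."""
    return sorted(
        uncoded_docs,
        key=lambda doc: responsiveness_score(model, doc),
        reverse=True,
    )


# Hypothetical seed set: documents a lawyer has already coded.
seed_set = [
    ("pricing agreement with distributor for Q3 rebates", True),
    ("draft rebate schedule and distributor pricing terms", True),
    ("lunch menu for the office party on Friday", False),
    ("reminder: parking garage closed this weekend", False),
]
uncoded = [
    "office party parking reminder",
    "updated distributor rebates and pricing",
]

model = train(seed_set)
ranked = rank_for_review(model, uncoded)
```

Here the document about distributor rebates is ranked ahead of the office reminder, so reviewers would see it first. The lawyers' coding decisions drive the ranking, and lawyers still decide what to do with it.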

How You Can Help

Fortunately, data scientists’ practical experience deriving insight from data can help drive the adoption of machine learning in departments across the company, including new tools for the legal team. Data scientists are well placed to explain the role that technology can play in distinct parts of the enterprise, and to unpack the difference between relying solely on technology to make every decision and treating it as a power tool that experts operate to get where they need to go. Corporate counsel will need to learn more about data science in order to leverage predictive analytics in e-discovery, which can help the team get to the facts of a case faster by speeding up review. Data scientists can take practical responsibility for guiding their corporate legal departments toward informed text analytics strategies and for helping them see text analytics as a cost- and time-saving solution.

As a data scientist, you can help corporate counsel shift attitudes about machine learning and close the consumption gap in several ways:

1. Understand the technology

According to the Coalition of Technology Resources for Lawyers (CTRL), there is a range of machine learning applications that can be deployed to facilitate e-discovery. Predictive coding, or TAR, is one common flavor; it refers to “a process for selecting and ranking a collection of documents using a computerized system that incorporates the decisions that lawyers have made on a smaller set of documents and then applies those decisions to the remaining universe of documents.” Other tools include advanced search capabilities, near-duplicate document detection, and email threading. The important point to share with your legal team is that machine learning and other text analytics tools augment lawyers’ decisions. The tools do not need to take control of the process.
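Near-duplicate detection, one of the tools mentioned above, is also easy to demonstrate. The sketch below assumes one common textbook approach, Jaccard similarity over three-word shingles. The 0.5 threshold, the function names, and the sample documents are illustrative assumptions, and commercial tools may use different techniques.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in a document."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Short documents become a single shingle; empty ones yield no shingles.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    if not union:
        return 0.0  # treat two empty documents as not comparable
    return len(a & b) / len(union)


def near_duplicate_pairs(docs, threshold=0.5, k=3):
    """Return (i, j, similarity) for every pair of documents at or above threshold."""
    sets = [shingles(doc, k) for doc in docs]
    pairs = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            sim = jaccard(sets[i], sets[j])
            if sim >= threshold:
                pairs.append((i, j, sim))
    return pairs


# Hypothetical documents: the first two differ by a single word.
docs = [
    "Please review the attached contract before the Friday meeting",
    "Please review the attached contract before the Monday meeting",
    "The quarterly sales numbers are attached for your records",
]

pairs = near_duplicate_pairs(docs)
```

With these inputs only the first two documents are paired, so a reviewer could code them together instead of reading each one separately. The pairwise comparison is quadratic in the number of documents, which is fine for a demonstration but would need an approximate method on a real collection.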

2. Prove the machine learning methodology can hold up in court

While predictive coding and TAR have been around for several years, legal professionals are responsible for presenting a defensible case in court and might worry about whether judges will approve of these somewhat newer tools. Fortunately, you can explain the defensibility of analytics in e-discovery by referencing well-known court cases such as Rio Tinto Plc v. Vale S.A. or more recent decisions such as Pyrrho Investments Ltd. v. MWB Property Ltd., where judges in the United States and the United Kingdom cited greater consistency, proportionality, and cost savings as reasons for using an assistive coding method like TAR over a less efficient manual review process.

3. Utilize available workflows

Advise your team to be on the lookout for cases that would be a good fit for analytics. It’s best to use these tools from the start of a project and fully incorporate them into the review strategy. Techniques like categorization, concept clustering, and email threading are frequently used before review begins, and applying them in a single system is key to using them successfully early in the e-discovery process. Make sure you thoroughly understand the best practices and proven workflows for your legal team’s chosen technology, because your expertise in text analytics can be put to use from the beginning of the adoption cycle all the way through a case.

The Future

Early adopters are using emerging text analytics technologies like predictive coding without too much fanfare, and these technologies will become the industry standard for attorneys. The only remaining questions are how soon practitioners will cross the chasm and embrace them, and how they can be helped along the way.


A former litigator/GC/AGC, Dean Gonsowski is an industry-recognized evangelist, thought leader, and speaker. Dean has a JD from the University of San Diego School of Law and a BS from the University of California, Santa Barbara. He has worked with companies around the e-discovery industry, including Relativity, and now serves as chief revenue officer of Active Navigation.
