by Greg Houston on July 01, 2015
Combating the time crunch of a second request can be challenging. In a “Second Requests Demystified” webinar we hosted last week, I was joined by Kevin Clark of Compliance Discovery Solutions, Ignatius Grande of Hughes Hubbard & Reed, and John Ingrassia of Proskauer to discuss how the right computer-assisted review workflow can help keep mergers and acquisitions on track.
During the webinar, the group offered insight into how legal teams can tackle these time-sensitive matters accurately and efficiently with the right technology. Read on for insights on the questions we didn’t get to answer during the live event and to watch a recording of the session.
Webinar attendee: Could you provide some metrics, from your experience, comparing a linear human review with a technology-assisted review?
Kevin: Compared with a traditional handling of a second request, technology-assisted review significantly reduces both the size of the review teams and the time needed for the review. In the telecom merger we discussed during our webinar, we divided the data into two data sets, one with 1.2 million documents and the other with 300,000. In total, humans reviewed 136,000 documents, and that included training the system for both data sets, reviewing the uncategorized data set, and a QC of 10 percent of both the responsive and non-responsive documents. At different stages, we had different team sizes, but overall, the teams were much smaller than they would have been in a traditional review. The largest team we had was 25 reviewers, when working on the uncategorized data.
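To put Kevin's figures in proportion, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted above; the variable names are ours and purely illustrative.

```python
# Figures quoted in Kevin's answer above.
dataset_sizes = [1_200_000, 300_000]   # the two data sets in the telecom merger
human_reviewed = 136_000               # training + uncategorized review + QC

total_documents = sum(dataset_sizes)
reviewed_share = human_reviewed / total_documents

print(f"Total documents: {total_documents:,}")
print(f"Share reviewed by humans: {reviewed_share:.1%}")
```

In other words, human reviewers looked at roughly 9 percent of the 1.5 million documents across both data sets.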
How do you decide whether to use random sampling, judgmental sampling, or previously coded documents in your training set?
Kevin: The short answer is it depends on your data and the facts of your situation. For a second request, it may be difficult to have previously coded documents that apply to what the United States Department of Justice is looking for. Judgmental sampling has worked best for us in training the system for these projects.
Greg: What you include in your judgmental sample or pre-coded set determines how effectively that data trains the system. Random sampling is the easiest route because no real knowledge of the data is needed, but you can do a more effective training round if you have a good judgmental sample or pre-coded set. The key is getting a representative group of items from across the database that covers as many different concepts as possible.
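The coverage point can be illustrated with a minimal sketch. It assumes each document already carries a hypothetical "concept" label (for example, from a clustering pass); that label and the function names below are our own illustration, not a feature of any particular review platform. A true judgmental sample is chosen by people who know the case, so the second function only approximates the goal of touching every concept.

```python
import random
from collections import defaultdict

def random_sample(docs, k, seed=0):
    """Simple random sample: needs no knowledge of the data."""
    rng = random.Random(seed)
    return rng.sample(docs, k)

def coverage_sample(docs, per_concept, seed=0):
    """Pick up to `per_concept` documents from every concept group,
    so that rare concepts are still represented in the training set."""
    rng = random.Random(seed)
    by_concept = defaultdict(list)
    for doc in docs:
        by_concept[doc["concept"]].append(doc)
    picked = []
    for concept in sorted(by_concept):
        group = by_concept[concept]
        picked.extend(rng.sample(group, min(per_concept, len(group))))
    return picked

def concepts_covered(sample):
    """Set of concept labels that appear in a sample."""
    return {doc["concept"] for doc in sample}

# Hypothetical, heavily skewed collection: one dominant concept, two rare ones.
docs = (
    [{"id": i, "concept": "pricing"} for i in range(95)]
    + [{"id": 95 + i, "concept": "contracts"} for i in range(3)]
    + [{"id": 98 + i, "concept": "hr"} for i in range(2)]
)

print(concepts_covered(random_sample(docs, 6)))
print(concepts_covered(coverage_sample(docs, 2)))
```

With a skewed collection like this one, a small random sample can easily miss the rare concepts, while the coverage-oriented selection always includes every concept that exists in the data.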
Which is more important: precision, recall, or overturn rate?
Kevin: In our experience, although precision and recall were discussed and monitored, the overturn rate was most important—particularly the stability of the overturn rate rather than the actual overturn rate itself. You want to minimize the volatility in the overturn rate with each round and see a plateau effect, or leveling off, of that stat.
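For readers who want to see how these three measures relate, here is a minimal sketch. Precision and recall use their standard definitions. The overturn rate is taken here to mean the share of sampled, system-categorized documents whose coding a human reviewer changed during QC; that reading, the plateau test, its window and tolerance, and the per-round numbers are all our assumptions for illustration, not figures from the matter Kevin describes.

```python
def precision(true_pos, false_pos):
    """Of the documents marked responsive, the fraction that truly are."""
    marked = true_pos + false_pos
    return true_pos / marked if marked else 0.0

def recall(true_pos, false_neg):
    """Of the truly responsive documents, the fraction that were found."""
    actual = true_pos + false_neg
    return true_pos / actual if actual else 0.0

def overturn_rate(overturned, sampled):
    """Fraction of QC-sampled documents whose system coding was overturned."""
    return overturned / sampled if sampled else 0.0

def has_plateaued(rates, window=3, tolerance=0.02):
    """True when the last `window` rounds vary by no more than `tolerance`.
    This is one simple way to express 'leveling off'; the threshold is arbitrary."""
    if len(rates) < window:
        return False
    recent = rates[-window:]
    return max(recent) - min(recent) <= tolerance

# Hypothetical overturn rates for five successive QC rounds.
rates_by_round = [0.22, 0.14, 0.09, 0.08, 0.085]

print(has_plateaued(rates_by_round[:3]))  # early rounds are still volatile
print(has_plateaued(rates_by_round))      # later rounds have leveled off
```

The point of the plateau check is the one Kevin makes: the absolute value of the overturn rate matters less than whether it has stopped moving from round to round.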
Data collection rarely seems to be complete at the time I want to start a training set. How do you address this?
Greg: It is quite common that not all documents are present at the beginning of an assisted review project. This isn’t an issue at all, because all of your example documents can be used against any set of documents. As the population grows, the system applies the same set of concepts it has learned from your progress so far to any new data. That means you can keep moving as new data comes in, instead of starting over.
Is there an obligation to divulge the use of predictive coding in responding to a second request?
Ignatius: The DOJ is pretty clear on this point when issuing requests for additional information and documentary material:
If the company or its agent uses or intends to use software or technology to identify or eliminate potentially responsive documents and information produced in response to this Request, including but not limited to search terms, predictive coding, near-deduplication, deduplication, and email threading, the company must provide a detailed description of the method(s) used to conduct all or any part of the search. [...] The Department strongly recommends that the company provide these items prior to conducting its collection of potentially responsive information and consult with the Department to avoid omissions that would cause the company's response to be deemed deficient. (http://www.justice.gov/atr/public/242694.htm)
This stance may differ among other regulatory agencies, so I would always carefully review the relevant agency's specifications, subpoena, and any other documents setting forth guidance.
How does the withholding of privileged documents play into a production for a second request, if at all?
John: Attorney work product and attorney-client privileged communications may be withheld from the document production to protect those analyses and communications from disclosure and allow counsel and the parties to freely evaluate the legal issues presented without concern for compromising their position. When documents are withheld on this basis, the parties are required to provide a privilege log identifying the withheld documents.
Greg Houston is a member of kCura’s advice team, providing guidance on customized e-discovery workflows and best practices. He has more than 12 years of experience in litigation support and e-discovery.