Search Terms in e-Discovery: "I'm Not Dead"

This post was originally published by D4, an Orange-level Relativity Best in Service Partner. In addition to a great Monty Python reference, it provides some good information about the role and application of search terms in the discovery process.

These lines from Monty Python and the Holy Grail illustrate the persistent debate about the use of search terms in e-discovery:

Collector: Bring out yer dead!

Purveyor: Here's one…

Dead Man: I'm not dead.

Collector: What?

Dead Man: I'm not dead!

Collector: 'Ere, he says he's not dead.

Purveyor: Yes he is.

Dead Man: I'm not.

Collector: He isn't.

Purveyor: Well, he will be soon, he's very ill.

Dead Man: I'm getting better.

Purveyor: No you're not, you'll be stone dead in a moment.

Collector: Well, I can't take him like that. It's against regulations…

Search terms are clearly not dead. No amount of argument will make them so, and it’s against regulations to do with them just what you want.

The holy grail of discovery has always been to distill important information from the rest. At this year’s series of Today's General Counsel Institute and in preparation for the predictive coding panel for the Georgetown Advanced eDiscovery Institute, I’ve had the chance to discuss the use of search terms with experts from among outside counsel, Fortune 500 law departments, technology providers, and at least one judge. The discussions pertain to the use of search terms by themselves, and also in conjunction with predictive, conceptual, and structured analytics.

"I’m not dead."

Analytics workflows (predictive coding, technology assisted review, email threading, near-duplicate detection, etc.) have overwhelmingly been shown to save time and money and to improve the quality of review. These defensible workflows have garnered the attention and approval of the courts.

The use of conceptual analytics is now the best practice for filtering and review. State-of-the-art analytics tools deliver a return on investment against the cost of review, and better pricing and availability have made their use routine over the last few years. Analytics workflows such as email threading, near-duplicate detection, and clustering are now preferred over keyword searches as a means to reduce large volumes of ESI and speed up the review process. The use of one or more analytics workflows has become economically feasible for cases of almost any size.

"He will be soon, he’s very ill."

So, search terms clearly are no longer the state of the art when it comes to filtering prior to review. The issues that arise when using search terms played a large role in the 2006 amendments to the FRCP. Though the comments to these rule changes do not mention search terms per se, they are thick with references to searches for relevant ESI. Nothing in the 2010 or 2015 revisions changes that, and proportionality is now written into the rules themselves. Revised FRCP 26(b) states that the parties may discover “any nonprivileged matter that is relevant to any party's claim or defense and proportional to the needs of the case.” Parties must still search for, preserve, and produce according to these rules. So long as most discovery targets the written or spoken word, search terms will be an attractive way to identify or filter out the obvious, but only if used responsibly. In our experience, keyword filtering, even when done responsibly and expertly, still yields a substantial number of non-responsive search hits.
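To make "non-responsive search hits" concrete, here is a minimal sketch of how a term list might be scored against a small sample that a reviewer has already coded. Everything in it is an assumption for illustration: the function names, the whole-word matching rule, the sample documents, and the terms are hypothetical, and a real review platform would supply its own search syntax and coding data.

```python
import re

def hits_any_term(text, terms):
    """True if any term appears in text as a whole word, ignoring case."""
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
        for term in terms
    )

def precision_recall(coded_sample, terms):
    """coded_sample: list of (text, is_responsive) pairs coded by a reviewer."""
    true_pos = false_pos = false_neg = 0
    for text, is_responsive in coded_sample:
        hit = hits_any_term(text, terms)
        if hit and is_responsive:
            true_pos += 1
        elif hit and not is_responsive:
            false_pos += 1   # a non-responsive search hit
        elif not hit and is_responsive:
            false_neg += 1   # a responsive document the terms missed
    # Precision: share of hits that are responsive.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: share of responsive documents that the terms found.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Hypothetical reviewer-coded sample and term list (illustration only).
coded_sample = [
    ("Please review the pricing agreement before Friday", True),
    ("Lunch pricing at the cafeteria went up again", False),
    ("Attached is the signed agreement", True),
    ("Let's discuss the deal terms offline", True),
    ("Fantasy football league update", False),
]
terms = ["pricing", "agreement"]

precision, recall = precision_recall(coded_sample, terms)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy sample, two of the three hits are responsive and two of the three responsive documents are found. The cafeteria email is the kind of non-responsive hit that even a carefully chosen term produces.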

"I can't take him like that. It's against regulations."

Dozens of opinions published in just the last few years cite the use of search terms. The focus of discussion in many of those cases is whether there has been transparency, validation, and testing of search terms, and whether the resulting production complies with the responding party’s obligations.

To maintain defensibility in the use of search terms, heed these warnings:

Judge Facciola sounded an industry warning in US v. O'Keefe, 537 F Supp 2d (DDC 2008), that you can't mindlessly use technology to apply search terms. And for judges to opine on which search terms may work better than others is "truly to go where angels fear to tread."

Judge Grimm pointed out "the well-known limitations and risks associated with [the use of keywords], and proper selection and implementation obviously involves technical, if not scientific knowledge," in Victor Stanley v. Creative Pipe, 250 FRD 251 (D. Md. May 29, 2008).

Judge Peck served "a wake-up call to the Bar...about the need for careful thought, quality control, testing and cooperation...in designing" a search protocol using keywords in Gross Constr. Assoc. v. Am. Mfrs. Mut. Ins., 256 FRD 134 (2009). He advised making sure you have initial input from business clients and adequate testing and sampling.
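The testing and sampling these opinions call for can be illustrated with a short sketch. One approach practitioners often describe is to draw a random sample from the documents the search terms did not hit, have a reviewer code it, and estimate how much responsive material was left behind (sometimes called the elusion rate). The sketch below assumes a simple random sample and a normal-approximation margin of error; the function names, sample size, and counts are hypothetical and are not taken from any of the cases above.

```python
import math
import random

def draw_sample(doc_ids, size, seed=0):
    """Draw a simple random sample of document IDs without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return rng.sample(doc_ids, min(size, len(doc_ids)))

def elusion_estimate(responsive_in_sample, sample_size, z=1.96):
    """Estimated share of responsive documents among the non-hits.

    Returns (rate, margin), where margin is a normal-approximation
    margin of error (z=1.96 corresponds to roughly 95% confidence).
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    rate = responsive_in_sample / sample_size
    margin = z * math.sqrt(rate * (1 - rate) / sample_size)
    return rate, margin

# Hypothetical numbers: 10,000 documents were not hit by any search term.
null_set_ids = list(range(10_000))
sample_ids = draw_sample(null_set_ids, 400, seed=1)

# Assume a reviewer codes the sample and finds 12 responsive documents.
rate, margin = elusion_estimate(responsive_in_sample=12, sample_size=len(sample_ids))
print(f"estimated elusion: {rate:.1%} +/- {margin:.1%}")
```

A high estimate suggests the terms need another round of revision. Whether a given rate is acceptable is a question for the parties and the court, not something the arithmetic decides.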

If considering search terms combined with the use of predictive coding or other TAR, consider:

In Kleen Products v. Packaging Corp., 1:10-cv-05711 (N.D. Ill.), Defendants refined their search terms over several months in an iterative process. The parties effectively met and conferred, and Defendants delivered a substantial production. Plaintiffs were still concerned that Defendants weren't finding everything that they should, and argued that Defendants should be using predictive coding instead. The court declined to order predictive coding and reserved judgment on whether the production was compliant, citing the substantial work done on the search terms.

In Bridgestone Americas, Inc. v. Int. Bus. Machs. Corp., No. 3:13-1196 (M.D. Tenn. July 22, 2014), the court approved use of predictive coding even after screening of the larger collection by use of search terms. The court did not directly rule on validation and testing of search terms, but more amorphously referred to the critical importance of “openness and transparency” between the parties.

In In re Biomet M2a Magnum Hip Implant Prods. Liab. Litig., No. 3:12-MD-2391 (N.D. Ind. Apr. 18, 2013) (“Biomet I”), the court refused to require the responding party to do predictive coding on the entire corpus of its documents, instead allowing it to rely on the keyword searching it used initially to filter the material prior to predictive coding.

"I'm getting better." "No you're not, you'll be stone dead in a moment."

Short of a judicial knock on the head, search terms will stay with us, both as a stand-alone method of filtering and in conjunction with other technologies. But they remain difficult and time-consuming to use defensibly compared to more advanced technologies. They require careful development and iterative testing, sampling, and revision. The art of filtering, and of faster and more economical review, lies in the use of analytics.