
Using Seed Documents During Computer-Assisted Review

Constantine Pappas

When it comes to transparency and quality control, seed documents are becoming a larger part of the conversation around computer-assisted review. Seeds are coded by human reviewers and used as examples to train a computer on a project’s categories. These seeds may be judgmentally selected by a case team or randomly selected via statistical sampling.
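For illustration, here is a minimal Python sketch contrasting those two selection approaches: a judgmental set hand-picked by the case team versus a simple random sample drawn from the review population. The document IDs and sample size are hypothetical, and this is not Relativity functionality.

```python
# Hypothetical illustration of the two seed-selection approaches described above.
import random

population = [f"DOC-{i:04d}" for i in range(1, 10001)]  # placeholder document IDs

# Judgmental selection: the case team hand-picks documents it knows are strong examples.
judgmental_seeds = ["DOC-0042", "DOC-0777", "DOC-1234"]

# Random selection: a simple random sample drawn from the whole population.
random_seeds = random.sample(population, k=100)
```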

It is important to submit strong examples so the engine can best understand each category. Seed documents submitted with an incorrect designation can cause the engine to make incorrect decisions on other documents, potentially harming the outcome of the project. For this reason, if there is any doubt about the category of a document, a user should refrain from submitting it as an example.

Given those considerations, the conversation centers on two main questions: when to use seeds to train the engine, and how to ensure they encourage the proper designations. To help address both, Relativity Assisted Review was designed with built-in transparency and flexible options for using seeds.

During development of Assisted Review in Relativity 8, we sought to make working with seed documents even more flexible. After hearing clients’ most common seed document workflows, we added new round types to support Assisted Review projects at multiple stages. The round type determines how Relativity treats manually coded documents. Assisted Review offers several options:

Training
Human-coded documents are automatically submitted to the engine as seeds once the administrator selects Categorize at the end of the round.

Quality control
As with training rounds, human-coded documents are automatically submitted to the engine as seeds once the administrator selects Categorize at the end of the round. Admins can choose not to categorize if additional seeding is unnecessary.

Pre-coded seeds
Documents coded outside of the Assisted Review workflow will be submitted to the engine as seeds.

Control set
To preserve the statistical calculations for precision, recall, and F1, documents associated with a control set will not qualify as seeds.
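Those statistics are only meaningful because the engine never trains on control documents. As a rough sketch of the arithmetic, assuming a simple responsive/not-responsive categorization (illustrative Python, not Relativity’s implementation):

```python
# Hypothetical sketch: precision, recall, and F1 from a control set, where the
# engine's predictions are compared against human coding on documents that were
# never submitted as seeds.

def control_set_metrics(control_docs):
    """control_docs: pairs of (human_responsive, engine_responsive) for each control document."""
    tp = fp = fn = 0
    for human, engine in control_docs:
        if human and engine:
            tp += 1          # engine agreed with a responsive call
        elif engine and not human:
            fp += 1          # engine called a non-responsive document responsive
        elif human and not engine:
            fn += 1          # engine missed a responsive document

    precision = tp / (tp + fp) if (tp + fp) else 0.0   # of docs the engine called responsive, how many truly were
    recall    = tp / (tp + fn) if (tp + fn) else 0.0   # of truly responsive docs, how many the engine found
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Example: three control documents coded by reviewers, then categorized by the engine.
print(control_set_metrics([(True, True), (True, False), (False, True)]))
# -> (0.5, 0.5, 0.5)
```

If any control document were also used as a seed, the engine would effectively be graded on material it had already seen, inflating all three numbers.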

Additionally, Relativity carefully tracks a project’s seeds to provide full visibility. All documents submitted as seeds are recorded as example documents in the categorization set, and for every categorized document, Relativity tracks which seed was responsible for the engine’s decision. In the case of an overturn or mistake, this makes it easy to navigate between a seed document and its categorization descendants and to quality control not just the computer’s judgments, but also the human decisions that drive them.
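A hypothetical sketch of that bookkeeping (the names are illustrative, not Relativity’s API): record which seed drove each categorization decision so that, when a seed’s coding is overturned, its descendants can be pulled back for re-review.

```python
# Illustrative only: map each categorized document to the seed responsible for
# the engine's decision, and keep the reverse index for overturn workflows.
from collections import defaultdict

responsible_seed = {}            # categorized doc ID -> seed doc ID that drove the decision
descendants = defaultdict(set)   # seed doc ID -> categorized doc IDs it influenced

def record_categorization(doc_id, seed_id):
    responsible_seed[doc_id] = seed_id
    descendants[seed_id].add(doc_id)

def docs_to_requalify(overturned_seed_id):
    """If a seed's human coding is overturned, return every document it categorized."""
    return sorted(descendants[overturned_seed_id])

record_categorization("DOC-0101", "SEED-7")
record_categorization("DOC-0102", "SEED-7")
record_categorization("DOC-0203", "SEED-9")
print(docs_to_requalify("SEED-7"))   # -> ['DOC-0101', 'DOC-0102']
```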

Though each case requires a unique approach to seed documents and quality control, Assisted Review is designed to offer a flexible, powerful workflow that delivers transparency at each stage. If you have any questions about making the most of Assisted Review, please don’t hesitate to contact us at advice@kcura.com.


Constantine Pappas is a licensed attorney with more than 15 years of legal experience. He has served as in-house counsel and managed both paper and electronic discovery for large-scale lawsuits and government investigations. As a member of Relativity’s customer success team, Constantine helps Relativity users with workflows for text analytics and computer-assisted review.
