by Jacob Cross on April 19, 2017
Originally published more than a year ago, this post is a helpful look at one of the easiest wins in all of e-discovery analytics. We've updated it to reflect the current capabilities of Relativity, and are republishing it to give you another look.
According to the Radicati Group, the average businessperson sends 36 emails per day. It may not sound like much, but if you do the math, it means a single employee creates nearly 10,000 emails per year (more if they work weekends).
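For anyone who wants to check that math, here is a quick sketch. The 260 working days (52 weeks of five days, ignoring holidays and vacation) is our assumption, not part of the Radicati figure.

```python
# Emails sent per day, per the Radicati Group figure quoted above.
EMAILS_PER_DAY = 36

# Assumption: 52 weeks x 5 working days, ignoring holidays and vacation.
WORKING_DAYS = 52 * 5
CALENDAR_DAYS = 365

weekday_total = EMAILS_PER_DAY * WORKING_DAYS
every_day_total = EMAILS_PER_DAY * CALENDAR_DAYS

print(f"Weekdays only: {weekday_total:,} emails per year")
print(f"Every day:     {every_day_total:,} emails per year")
```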
With these numbers, it's no wonder email is the dominant type of data in e-discovery. Even in cases involving only a few custodians, you're looking at a data set filled with thousands of emails, one for every time your custodian hits send.
Yet somehow, email threading still isn’t used in every case—which is shocking to us, given its two huge benefits. First, it prevents your team from reviewing information multiple times; second, and even more importantly, it reduces the likelihood of coding mistakes.
Avoid Déjà Vu
Email threading identifies email relationships—threads, people involved in a conversation, attachments, and duplicate emails—and groups them together so you can view them as one coherent conversation.
First, text analytics will identify which of the documents in your data set are emails, then look for embedded messages within those emails. For example, if Rick writes Daryl an email, Daryl's reply will likely contain Rick’s original message at the bottom.
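Each individual message inside an email, whether it is the newest one at the top or an older one embedded below it, is called a segment. Daryl's reply, if it were part of an e-discovery collection, would have two segments: his own message and Rick's original.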
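To make the idea of segments concrete, here is a minimal sketch of how an email body might be split into them. It is not how Relativity does it: the separator patterns and the `split_segments` helper are our own assumptions for illustration, and real mail clients mark quoted messages in many more ways.

```python
import re

# Assumption: embedded messages are introduced by one of these common
# separator lines. Real mail clients vary widely, so this list is illustrative.
SEPARATOR = re.compile(
    r"^(?:-{2,}\s*Original Message\s*-{2,}|On .+ wrote:)\s*$",
    re.MULTILINE,
)

def split_segments(body: str) -> list[str]:
    """Return the segments of an email body, newest first."""
    parts = SEPARATOR.split(body)
    return [part.strip() for part in parts if part.strip()]

# Hypothetical reply from Daryl with Rick's original message quoted below it.
daryl_reply = """Sounds good, see you then.

-----Original Message-----
From: Rick
To: Daryl

Can we meet at noon?"""

segments = split_segments(daryl_reply)
print(len(segments))
for segment in segments:
    print("---")
    print(segment)
```

Daryl's reply comes back as two pieces: his own message, followed by Rick's original with its quoted header lines.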
An algorithm compares and matches segments, grouping emails and attachments from the same conversation (known as a family) into a neat thread. Next, the technology analyzes the text, sent time, attachments (and their text), and the sender of each email to determine uniqueness or inclusiveness.
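As a rough illustration of the grouping step, the sketch below puts emails into the same thread when they share the same oldest segment. This is a deliberate simplification that looks only at text; the document IDs, the `normalize` helper, and the `group_into_threads` function are hypothetical, and a production threading engine also weighs sent time, sender, and attachments.

```python
from collections import defaultdict

def normalize(segment: str) -> str:
    """Collapse whitespace and case so trivially different copies match."""
    return " ".join(segment.lower().split())

def group_into_threads(emails: dict[str, list[str]]) -> dict[str, list[str]]:
    """Group email IDs by their oldest segment.

    `emails` maps a hypothetical document ID to its segments, newest first,
    so the last segment is the message that started the conversation.
    """
    threads = defaultdict(list)
    for doc_id, segments in emails.items():
        root = normalize(segments[-1])
        threads[root].append(doc_id)
    return dict(threads)

emails = {
    "DOC-1": ["Can we meet at noon?"],
    "DOC-2": ["Sounds good.", "Can we meet at noon?"],
    "DOC-3": ["Lunch menu attached."],
    "DOC-4": ["Booked a room.", "Sounds good.", "can we meet  at noon?"],
}

threads = group_into_threads(emails)
for root, doc_ids in threads.items():
    print(root, "->", doc_ids)
```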
Inclusive email messages contain the most complete content—all the text and attachments in a whole email family. Conversely, non-inclusive emails are those with text and attachments that are contained in another (inclusive) email. In other words, if a user reviews only the inclusive messages, they will have read all content in the email family.
By reviewing inclusive messages, rather than non-inclusive messages, your team bypasses redundant content, reducing the number of documents they need to review.
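The inclusive test can be sketched in a few lines. The version below assumes each email has already been broken into a list of segments, newest first, and it ignores attachments entirely. The `is_contained` and `find_inclusive` helpers and the document IDs are hypothetical, so treat this as an illustration of the idea and not as the product's actual algorithm.

```python
def is_contained(shorter: list[str], longer: list[str]) -> bool:
    """True if `shorter` is the tail of `longer` (segments are newest first)."""
    return len(shorter) < len(longer) and longer[-len(shorter):] == shorter

def find_inclusive(thread: dict[str, list[str]]) -> set[str]:
    """Return IDs of emails whose content is not fully contained in another."""
    return {
        doc_id
        for doc_id, segments in thread.items()
        if not any(
            is_contained(segments, other)
            for other_id, other in thread.items()
            if other_id != doc_id
        )
    }

thread = {
    "DOC-1": ["Can we meet at noon?"],
    "DOC-2": ["Sounds good.", "Can we meet at noon?"],
    "DOC-3": ["Booked a room.", "Sounds good.", "Can we meet at noon?"],
    "DOC-4": ["I can't make it.", "Can we meet at noon?"],
}

inclusive = find_inclusive(thread)
print(sorted(inclusive))
```

Here DOC-3 and DOC-4 are inclusive: DOC-3 carries everything in DOC-1 and DOC-2, while DOC-4 is a separate branch of the conversation whose reply appears nowhere else. Note that this sketch would mark two exact duplicates as inclusive, since neither is strictly contained in the other, so duplicates would need separate handling.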
Give Your Reviewers the Complete Picture
As you can verify in your own inbox, emails can have tens—sometimes even hundreds—of segments, some of which are potentially important to your investigation (“Want to commit fraud with me?”) and others that are just fluff (“Thanks for your help!”).
When you don’t use email threading, you’re setting your reviewers up to only see portions of those long chains—sometimes just the “fluff,” even though there may be more to the story.
For example, say there are 10 messages in an email family. If these 10 messages were batched to reviewers without using email threading, they would be mixed into the data set as separate messages with no particular order or grouping.
Not only does this open the door to reviewing the first message 10 times (once on its own, once as a segment in the second message, once as a segment in the third message, and so on), but it also increases the chance that your reviewer will miss potentially responsive information. With only part of a larger conversation in front of them, it's difficult to make an accurate coding decision.
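To put a number on that redundancy, here is a back-of-the-envelope sketch. It assumes the simplest possible case, which is our assumption and not something every thread will match: a single linear chain in which each reply quotes every earlier message.

```python
# Assumption: a linear 10-message chain in which message k quotes all k-1
# earlier messages, so it carries k segments.
MESSAGES = 10

# Without threading, every message is reviewed in full: 1 + 2 + ... + 10.
segments_without_threading = sum(range(1, MESSAGES + 1))

# With threading, only the final (inclusive) message is reviewed.
segments_with_threading = MESSAGES

# The first message appears as a segment in every message in the chain.
times_first_message_read = MESSAGES

print("Segments read without threading:", segments_without_threading)
print("Segments read with threading:   ", segments_with_threading)
print("Times the first message is read:", times_first_message_read)
```

Under that assumption, reviewers read 55 segments to cover 10 messages' worth of content, and the gap widens quickly as chains get longer.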
For example, how would you code the following exchange?
If you were a reviewer, you’d likely code this email as non-responsive, as it doesn’t contain any relevant information. But, had your team used email threading, you would have noticed that there's more to this conversation.
First, you would see there are three emails in the family:
Then, you would choose to review the inclusive message to ensure you're getting the whole conversation as you would in your email client:
If pricing is important to your case, this email is definitely responsive.
Another benefit of email threading is the ability to see the organization and coding status of these conversations at a glance. With email threading visualization, as you drill into each piece of a thread, you're able to see an illustration of where it lives in the chain. So, for this final email about setting up a pricing conversation, you can see that it's inclusive (fully shaded) and has been marked as responsive (its box is blue).
So, in addition to avoiding duplicative reviews of the same email, you're able to perform a quick QC as you're reviewing a document—verifying which component of a conversation you're looking at, and getting insight into any coding decisions that have been made so far.
Email threading is right for every case, regardless of how big or small, because it makes review easier, more efficient, and more accurate.
Jacob Cross is a member of kCura’s customer success team, where he helps Relativity users make the most of the platform. He has worked in the e-discovery industry since 2007, helping clients use technology to increase productivity and reduce overall review time.