Google’s Ngram Viewer shows that the word “cloud” is used twice as often in today’s books as it was at the turn of the century. A friend recently referred to cloud proliferation as never-ending, which feels so true. It seems that, for every new data source open to discovery, two more are waiting in the wings. And while this dramatic change is already a tired topic to write about, the implications are still very real for RelativityOne users.
Across segments—even and especially in the niche of e-discovery—the cloud simultaneously enables and complicates the work we all do every day.
Hoping to gather insights on how Relativity is helping our users tackle new data sources, I recently sat down with Kishore Rao, a software engineering manager at Relativity who leads a team developing Collect for RelativityOne. (Fun fact: Around here, this group is better known by their team name, 1 (800) COLLECT.)
Initially, I simply wanted to peek inside the room where the engineering magic happens and get some first-hand perspectives on just how our engineers at Relativity take an idea (like building a new integration with Google Vault, Google’s application to export its discoverable data) from concept to the real world.
What we talked about instead were decisions. And time. (And let’s be real, half the meeting I blabbed about my department, seeking ways to understand Kishore’s experience in the context of my own.)
Why Are Data Collection Integrations Helpful?
Collect for RelativityOne was engineered to help companies, and the law firms and providers who support them, streamline collections while keeping all data in a single, secure, end-to-end platform. The goal is to help you collect, process, and review data all from the same place, with no jumping between programs or local downloads required.
With this latest integration, the story for our user community is pretty straightforward: collecting Google data just got a whole lot easier.
So what makes adding data sources like this a unique effort for our engineers?
Google, like all technology proprietors, makes proprietary decisions about the format in which data is exported from their systems. With numerous cloud services making format decisions of their own, no two exports are the same. There’s no one-size-fits-all compatibility plug-in to simplify this process, so different data types will look different in a review tool’s front end without some compensating on the back end.
Without a purpose-built integration to do the back-end adjusting for each format, the people in charge of taking that export and preparing it for human eyes are left to do the work. Often this entails tasks like normalizing metadata, and even the small things quickly compound, adding hours to the clock.
The Unique Challenges of Google Vault Data
For these reasons, data collected through Google Vault is one of a kind. Here’s what makes it uniquely challenging to collect for e-discovery and other data review requirements.
Missing Metadata
A particularly noteworthy consideration is that metadata is not always stored on the file level of documents exported with Google Vault. For example, information related to file folders is exported in the form of tags. This can cause confusion when the resulting documents end up in review queues downstream.
When ingesting this data into a review platform, technical teams need to figure out a way to reassociate this information with the documents in question. Companies processing multiple downloads from Google Workspace often need to stand up an ancillary solution to reliably perform this step, bringing another tool and more data exchanges into the mix. That isn’t ideal.
Fortunately, for RelativityOne users with the latest, Google-compatible version of Collect, this is done automatically.
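To make the reassociation step concrete, here is a minimal Python sketch of what an ancillary solution might do. Everything in it is an assumption for illustration: the `Document` class, the `reassociate_tags` function, and the shape of the tag records (`doc_id`, `tag`, `value`) are hypothetical and do not reflect the actual Google Vault export schema or how Collect works internally.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A collected file (hypothetical structure, not a real export schema)."""
    doc_id: str
    file_name: str
    metadata: dict = field(default_factory=dict)


def reassociate_tags(documents, tag_records):
    """Attach export-level tag values to the documents they describe.

    Each tag record is assumed to look like
    {"doc_id": ..., "tag": ..., "value": ...}.
    Returns the records that matched no document so they can be reviewed.
    """
    by_id = {doc.doc_id: doc for doc in documents}
    unmatched = []
    for record in tag_records:
        doc = by_id.get(record["doc_id"])
        if doc is None:
            # Keep orphaned tags visible instead of silently dropping them.
            unmatched.append(record)
            continue
        doc.metadata.setdefault(record["tag"], []).append(record["value"])
    return unmatched
```

Returning the unmatched records, rather than discarding them, is a deliberate choice here: a tag that points at no document is usually a sign that something went missing between export and ingestion.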
Short Messages
Short message data are on the rise. Employees increasingly use text messages or one (or more) of the many available enterprise chat platforms to exchange information and have important discussions with colleagues. Google Chat is just one of their options—and if you’re using Google Workspace, it’s very likely you have some of this data in your stores.
The challenge with short message data is that restructuring stored conversations so investigators or legal teams can review them easily is difficult. Default or legacy solutions and settings may render the data virtually illegible (often as raw JSON files).
However, when collecting Google Chat data, Relativity automatically converts the chats into Relativity’s short message format so that reviewers can search and see conversations as they appear in their native applications, making for much more intuitive and insightful reviewing.
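As a rough illustration of why raw chat exports are hard to read and what restructuring them involves, the sketch below turns a JSON array of messages into a chronological transcript. The field names (`sender`, `timestamp`, `text`) and the `render_conversation` function are assumptions made up for this example; they are not the real Google Chat export layout, and this is not Relativity’s short message format.

```python
import json
from datetime import datetime


def render_conversation(raw_json):
    """Turn a JSON array of chat messages into readable transcript lines.

    Assumes each message has "sender", "timestamp" (ISO 8601, no timezone
    suffix), and "text" keys. These names are hypothetical.
    """
    messages = json.loads(raw_json)
    # Exports are not guaranteed to be in order, so sort by time first.
    messages.sort(key=lambda m: datetime.fromisoformat(m["timestamp"]))
    lines = []
    for message in messages:
        sent_at = datetime.fromisoformat(message["timestamp"])
        lines.append(
            f"[{sent_at:%Y-%m-%d %H:%M}] {message['sender']}: {message['text']}"
        )
    return lines
```

A real conversion also has to handle threads, edits, deletions, attachments, and time zones, which is where most of the effort goes.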
Be Prepared to Tread Carefully
Another layer of complexity in discovering diverse data types is defensibility. In the event of litigation, parties are required to demonstrate defensible collections of potentially relevant data, answering questions and providing documentation about when and how the data was gathered. Failing to responsibly perform this crucial step in the e-discovery process may require repeating the process, or result in other penalties and even sanctions.
Using collection tools without seamless integrations for all of the data types in your collection will require a detailed understanding and documentation of each step in your process. So not just where data comes from, but how it’s forensically captured, how it’s normalized to display nicely in your review tool, where it’s stored in the meantime, and so forth.
If you’re connecting RelativityOne to Google Vault, the system will create a forensic fingerprint during collection and generate a manifest you can produce to requesting parties to demonstrate a defensible collection. This is an essential piece of documentation, but it’s also just one step toward a defensible e-discovery strategy. Make sure you’re also carefully considering your collection parameters and communicating those as needed.
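For readers unfamiliar with the idea of a fingerprint and manifest, the following Python sketch shows the general technique using a cryptographic hash. The choice of SHA-256, the manifest fields, and the function names `fingerprint`, `build_manifest`, and `verify` are all assumptions for illustration; the article does not say which algorithm or manifest layout RelativityOne uses.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def fingerprint(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record name, size, hash, and hashing time for each collected file.

    The field names are hypothetical, not a real manifest format.
    """
    entries = []
    for path in paths:
        file_path = Path(path)
        entries.append({
            "file": file_path.name,
            "bytes": file_path.stat().st_size,
            "sha256": fingerprint(file_path),
            "hashed_at": datetime.now(timezone.utc).isoformat(),
        })
    return entries


def verify(path, entry):
    """Check that a file still matches the hash recorded in its entry."""
    return fingerprint(path) == entry["sha256"]
```

Because any change to a file changes its hash, re-running `verify` later gives a simple way to show that what is under review is byte-for-byte what was collected.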
Make the Call
Yes, the proliferation of the cloud continues to challenge teams on how data is collected before analysis and review. It can seem never-ending, and practitioners just can’t predict which new data source is going to throw the department for a loop next. Instead, they need to be agile and ready to accommodate whatever pops up, and they need the right tools to help them do it.
My meeting with Kishore helped to drive home the point that every organization, every department, and every team makes decisions with significant downstream effects on yet more teams, departments, and organizations.
As a result, Google Vault data doesn’t come out fully baked for e-discovery, and organizations must then divert their own time to make it fully discoverable. But, on the plus side, the engineers from 1 (800) COLLECT decided to help make it easier.
They focus on building a forward-thinking product that eliminates setup time between steps, proactively develops new integrations as new data sources become more prolific, and always keeps the results defensible. “We want our users to forget they ever feared a collection,” as Kishore put it for me.
(P.S. – Toward the end of our discussion, I asked Kishore what he wanted to tell the Relativity community about the work he and his team are doing. He wanted to get across how seriously the team takes the quality of the product and its defensibility, and that customer feedback only helps improve it.)