
What You Need to Know About Collecting Microsoft Teams Data

Matthew Vargo

Recently, the new data source Collect users request most often is Microsoft Teams. This is a direct result of the rapid adoption Teams has experienced over the past two years.

The COVID-19 pandemic accelerated adoption of Teams as remote work became a necessity; however, even as the global economy continues to recover from the pandemic, adoption of Teams is continuing at a rapid pace. Currently, Microsoft reports nearly 250 million monthly active Teams users.

Increasing corporate adoption of Teams has a direct impact on e-discovery professionals tasked with collecting discoverable data from custodians in relation to litigation, investigations, and similar matters. It’s just one example of how proliferating data sources have complicated things in recent years, as new data types and storage locations emerge to throw a wrench in the collection processes of old.

But what makes Teams data, in particular, tricky? Here are five considerations to keep in mind if you’re collecting it.

#1: Data Locations Can Be Decentralized

Finding, let alone collecting, Teams data is difficult because the data resides in various places and requires multiple workflows or techniques to access it all. Once you’ve collected files from these disparate locations, the subsequent aggregation of your custodial data can be time-consuming and error-prone.

When on the hunt for Teams data, make sure you consider each of these locations as sources for different data types:

  • Private chats are typically stored in the user's Outlook mailbox, in the Team Chat folder under Conversation History.
  • Group chats are stored in the Conversation History folder of the group mailbox.
  • Uploaded files are stored in SharePoint, but in different locations based on whether they were shared via private or group chats.
  • “Modern attachments” are documents residing in OneDrive and SharePoint but presented as links in chat; unfortunately, these files are unsupported by most traditional collection tools, which may make manual access necessary.

#2: Defensibility Matters

Defensibility is at the heart of any collection. It includes protecting data against alteration, maintaining metadata, and providing for future analysis of data integrity. Not all tools that collect Microsoft Teams data preserve and produce the details legal teams need to maintain defensibility. That can be a costly gap if a court discovers discrepancies and requires the producing party to redo its work.

Processes can also disrupt defensibility. Some technologies, by their nature, require collections to be exported locally before they’re loaded into a platform for review. This extra hop introduces the potential for human error or misconduct. Each time data is exported, loaded, and exported again, the chain of custody between document collection and production gets harder to manage.

Often, producing parties are asked to authenticate that a document is the original. The industry standard for proving authenticity is SHA-256 hashing. According to the Cybersecurity & Infrastructure Security Agency:

A hash function (also called a “hash”) is a fixed-length string of numbers and letters generated from a mathematical algorithm and an arbitrarily sized file such as an email, document, picture, or other type of data. This generated string is unique to the file being hashed and is a one-way function—a computed hash cannot be reversed to find other files that may generate the same hash value.

To foster defensibility, legal teams should ensure the following when collecting Teams data:

  • Metadata is collected alongside each document.
  • Data exports are kept to a minimum.
  • The system automatically creates a SHA-256 hash for each unique document.
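As a rough illustration of the last point, here is a minimal Python sketch of hashing a collected file with SHA-256 and later checking it against the recorded value. It uses only the standard library's hashlib; the function names `sha256_of_file` and `is_unchanged` and the chunk size are assumptions made for this example, not a description of how any particular collection tool works.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large exports do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_hash: str) -> bool:
    """Compare a file's current hash with the hash recorded at collection time."""
    return sha256_of_file(path) == recorded_hash.lower()
```

Recording the digest at the moment of collection and recomputing it before production gives a simple check that the file was not altered in between; changing even one byte of the file produces a different digest.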

#3: Modern Attachments Create Modern Complications

The success of platforms like Microsoft Teams has shifted how people within companies share documents. Instead of attaching documents to email communications, professionals are now likely to share links within private and public channels. In addition to the proliferation of collaboration platforms, employees choose to link documents because it cuts down on the number of versions of a single document, permits multiple people to modify the same document, and allows the document owner to closely control who has access to the file.

In March 2021, the United States District Court for the Southern District of New York decided in Nichols v. Noom Inc. against requiring the production of documents from linked sources. United States Magistrate Judge Katharine Parker noted that the issue raised:

… complex questions about what constitutes reasonable search and collection methods in 2021—when older forms of communicating via emails and documents with attachments and footnotes or endnotes are replaced by emails and documents containing hyperlinks to other documents, video, audio, or picture files. It also highlights the changing nature of how documents are stored and should be collected.

#4: Standards Are Still Evolving

The rulebook on how Teams data should be collected, reviewed, and produced is hardly finished. For example, ESI protocols could request that legal teams provide documents linked within chats as part of the family of documents for that thread. And case law is far from deciding, ultimately, the best way to handle chat data.

Many tools today simply provide the link as part of the text collected—not the document itself—potentially missing a big piece of the picture. Manually discovering and linking these separate documents is no small feat, if you’re required to do so.
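To give a sense of the first step involved, the sketch below scans chat message text for links that appear to point at SharePoint or OneDrive. It is only an illustration: the function name `find_linked_files`, the URL pattern, and the list of hostname suffixes are assumptions made for this example, and actually retrieving each linked document (with the right version and permissions) is a separate and harder problem that is not shown.

```python
import re
from urllib.parse import urlparse

# Hostname suffixes assumed, for this example, to indicate a linked document.
LINKED_FILE_HOST_SUFFIXES = ("sharepoint.com", "onedrive.live.com", "1drv.ms")

# Deliberately simple URL matcher; real chat content may need HTML parsing instead.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def find_linked_files(message_text: str) -> list[str]:
    """Return URLs in a chat message that point at SharePoint or OneDrive."""
    links = []
    for url in URL_PATTERN.findall(message_text):
        url = url.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        host = (urlparse(url).hostname or "").lower()
        if any(host == s or host.endswith("." + s) for s in LINKED_FILE_HOST_SUFFIXES):
            links.append(url)
    return links
```

For example, calling `find_linked_files` on a message containing one `contoso.sharepoint.com` link and one unrelated link returns only the SharePoint URL, which could then be queued for collection alongside the chat thread.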

More modern tools, such as Collect in RelativityOne, preserve the link as well as the actual document from SharePoint. With this capability, you won’t be left with an incomplete picture or a system of disjointed tools cobbled together to get that complete picture.

Additionally, proportionality is a not-insignificant factor in how courts and parties require or request data like Teams chats to be managed during discovery. However, as more legal teams begin to adopt next-generation collection and review tools to streamline this process, costs are likely to go down. This means the issue of proportionality could soon work less in your favor if you’re behind the curve.

#5: Export APIs Are Your Friend

Microsoft recently formalized export APIs for Teams data. While some collection solutions rely on other types of APIs, Microsoft’s new, standard interfaces will quickly become the only way to gather both public and private chats in Teams. Professionals looking to future-proof their collections will want to invest in solutions that utilize the standard API.

As an additional note, the new export API is considered a “protected API,” and you’ll need to get approval from Microsoft to use it for collections. 
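For orientation, the sketch below builds request URLs for the Teams export endpoints in Microsoft Graph (`getAllMessages` under a user's chats and under a team's channels). Treat it as an assumption-laden illustration rather than a reference: the helper names `chat_export_url` and `channel_export_url` are made up for this example, the endpoint paths and the `$top` parameter should be checked against the current Microsoft Graph documentation, and authentication, paging, and licensing are not shown.

```python
from typing import Optional
from urllib.parse import quote, urlencode

GRAPH_BASE_URL = "https://graph.microsoft.com/v1.0"


def chat_export_url(user_id: str, top: Optional[int] = None) -> str:
    """Build the export URL for the chat messages of one user (ID or UPN)."""
    url = f"{GRAPH_BASE_URL}/users/{quote(user_id, safe='@')}/chats/getAllMessages"
    if top is not None:
        # Keep the "$" literal so the OData parameter reads "$top".
        url += "?" + urlencode({"$top": top}, safe="$")
    return url


def channel_export_url(team_id: str) -> str:
    """Build the export URL for all channel messages in one team."""
    return f"{GRAPH_BASE_URL}/teams/{quote(team_id, safe='')}/channels/getAllMessages"
```

A collection tool would send a GET request to these URLs with an access token for an application that Microsoft has approved for the protected API, then follow the paging links in each response until the export is complete.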

Without a properly connected and integrated collection tool, another issue inherent to short message data like Teams chats relates to the form, or the container, in which it comes packaged. Currently, Microsoft Teams data is exported as a mess of .PST email files, which are not very user friendly to review or interpret. Companies waste time organizing each .PST file for review and production when the original exchange took place as a threaded conversation.

Investing in an Easy-to-Use Solution

As data sources continue to evolve, and in-house teams expand their expertise across the litigation lifecycle, simplicity and consistency are king.

Along each phase of the process, from early case assessment to full-blown reviews, legal teams benefit from solutions that automatically convert chat data into a native look. For those using RelativityOne, this is accomplished with Relativity’s short message format.

Investigators, in-house counsel, and internal IT resources don’t have the time or incentives to learn multiple e-discovery tools for each phase of the EDRM. Even if they did, it simply isn’t feasible to learn and use every tool to perfection. Investing in end-to-end tools not only keeps data in a single repository, but also enables the people who use them to develop deep knowledge of the product and build robust, defensible workflows that meet their unique needs.

Artwork for this article was created by Sarah Vachlon.



Matthew Vargo is a proud member of the partner marketing team at Relativity, where he tries his damndest to tell the Partner + RelativityOne story. Occasionally, he writes blogs.
