by Sam Bock
on March 28, 2017
Searching has been at the crux of e-discovery since the inception of modern platforms for digital data review. Before analytics, searching options like Boolean strings and dtSearch were the only ways to parse through large volumes of information and get to the important stuff fast—and now that analytics options are here, these tactics are still an important step in many e-discovery workflows.
Unfortunately, heinous searches have also been at the crux of e-discovery since its earliest days. Many folks from the industry have shared horror stories about overly complex or inclusive searches that turned project timelines upside-down.
But no need to fear: it’s the internet to the rescue (as it so often is). Online resources abound for building better searches that yield more accurate, narrow results. Here are a few we’ve noticed.
MoreWords.com is a simple website that can help you decode what results you might see from a wildcard search.
When it comes to searching, a wildcard enables you to search for all terms with a common root without having to type out every one of those terms individually. For example, let’s say one of your keywords is “apple.” Using an “*” wildcard allows you to search for a root followed by any combination and amount of characters: so “apple*” will return “apple,” “apples,” and “applesauce.”
The wildcard can be dangerous, though. In that same example, if you accidentally put “appl*” into your search, you’ll return all kinds of irrelevant results: “appliance,” “applicant,” or “applaud,” for example.
MoreWords.com can help you test out your wildcards before you use them. It’s not an all-inclusive list—built for word games, the engine will not return proper nouns or words with special characters like hyphens—but it will give you an indication of whether your wildcard might leave things too open to interpretation.
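If you would rather test a wildcard against your own word list, the same idea can be sketched in a few lines of Python. This is only an illustration: the word list and the `expand_wildcard` helper are hypothetical, and Python's `fnmatch` matching is a stand-in that may not behave exactly like the wildcard rules in your search engine.

```python
from fnmatch import fnmatchcase

# Hypothetical sample vocabulary; in practice this might be exported
# from your own data set or a dictionary file.
WORDS = ["apple", "apples", "applesauce", "appliance",
         "applicant", "applaud", "maple"]


def expand_wildcard(pattern, words):
    """Return every word that the wildcard pattern would match."""
    pattern = pattern.lower()
    return [w for w in words if fnmatchcase(w.lower(), pattern)]


narrow = expand_wildcard("apple*", WORDS)  # the intended search
broad = expand_wildcard("appl*", WORDS)    # the accidental, shorter root

print(narrow)
print(broad)
```

Comparing the two lists side by side shows how much noise a single missing letter can add before you ever run the search.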
Created to help students and professionals search academic archives, the Search Strategy Builder can help you create very simple Boolean search strings.
Boolean logic is used in many search tools, from Google to e-discovery software. It enables you to use operators like “OR,” “NOT,” and “AND” in your searches. So if you’re searching for documents that talk about apples, you might want to say “apples NOT oranges,” to cull out any documents that use the old cliché about comparing two disparate objects.
Via the Search Strategy Builder, you can build simple search strings that use “AND” and “OR” between search terms or phrases. Simply input the concept you want to search for in the grid, and it will generate a search statement for you based on the words you provided.
Although this is a very basic start to what you can do with Boolean logic, it can be a helpful tool for a beginner to practice building search strings, and you can always add to the search statement once it’s been created for you.
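To see what a grid-based builder is doing, here is a minimal sketch in Python. It assumes each concept is a list of interchangeable terms, joins terms within a concept with “OR,” and joins the concepts with “AND.” The `build_search_string` function is hypothetical and is not the Search Strategy Builder's actual code.

```python
def build_search_string(concepts):
    """Combine concept groups into one Boolean search string.

    Terms inside a concept are joined with OR; concepts are joined with AND.
    Multi-word phrases are wrapped in double quotes.
    """
    groups = []
    for terms in concepts:
        cleaned = [t.strip() for t in terms if t.strip()]
        if not cleaned:
            continue  # skip empty grid columns
        quoted = [f'"{t}"' if " " in t else t for t in cleaned]
        joined = " OR ".join(quoted)
        # Parentheses are only needed when a concept has several terms.
        groups.append(f"({joined})" if len(quoted) > 1 else joined)
    return " AND ".join(groups)


query = build_search_string([
    ["apple", "apples"],        # concept 1 and its variants
    ["orchard", "fruit farm"],  # concept 2 and its variants
])
print(query)  # (apple OR apples) AND (orchard OR "fruit farm")
```

The parentheses matter: without them, many engines would evaluate “AND” before “OR” and return a very different set of documents.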
If you’re using Relativity, there are plenty of on-demand tutorials in our Training Center to help you get the hang of advanced searching options. Relativity documentation also features workflow recipes with step-by-step instructions.
Say your case involves a search term with a special character, like a percent sign or exclamation point. How in the world do you search for that, knowing these are typically not included in a search engine’s alphabet? You can learn via this video, or by reading this workflow recipe.
Or, let’s say you need to look for a recurring pattern in your data set rather than a specific set of characters. For example, you need to identify any Social Security numbers so they can be properly redacted, but you don’t have any specific numbers in mind. Regular expressions (RegEx) can help. Here’s a tutorial on how to build them.
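As a rough illustration of the Social Security number example, the sketch below uses Python's `re` module. It assumes numbers appear in the hyphenated 123-45-6789 format only, and the sample text is made up. RegEx syntax varies between engines, so treat the pattern as a starting point rather than something to paste directly into your search tool.

```python
import re

# Three digits, two digits, four digits, separated by hyphens.
# \b keeps the pattern from matching inside longer runs of digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_candidates(text):
    """Return every substring that looks like a hyphenated SSN."""
    return SSN_PATTERN.findall(text)


def redact_ssns(text, placeholder="[REDACTED]"):
    """Replace every SSN-shaped substring with a placeholder."""
    return SSN_PATTERN.sub(placeholder, text)


# Made-up sample text with one SSN-shaped value and two near misses.
sample = "Employee SSN: 123-45-6789. Call 555-867-5309 or see invoice 2017-03-28."

hits = find_ssn_candidates(sample)
redacted = redact_ssns(sample)
print(hits)
print(redacted)
```

Note that the phone number and the date are left alone because their digit groupings differ. A pattern like this will still flag anything shaped like an SSN, so the hits are candidates for review, not confirmed numbers.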
The benefit of tutorials is that you can learn kinesthetically: they offer click-through opportunities right inside the software you’re using. Videos, like this one on common dtSearch questions, provide clear visual walkthroughs, and workflow recipes offer step-by-step written instructions on how to get things done in your workspace.
When searching, do what search engine optimization experts do: make sure you’re covering all your bases.
As you build your list of keywords for a review project, it’s important to be just the right amount of inclusive. You want to return everything that’s relevant, but only what’s relevant. So whether you find your list too short or too long, it’s a good idea to compare what you have with what else might be out there.
Using the keyword suggestion tool (specifically under the “Keyword Synonyms” section of your results), you can discover words that are often correlated with your search term and decide if they’re worth adding to your list. For example, the tool tells me that if I’m searching for “Einstein,” I might want to look at the words “brainiac” and “genius,” too.
(Fun fact: with Relativity Analytics, you can accomplish this keyword expansion within your workspace—and get results tailored specifically for the concepts discussed in your data set. Here’s how.)
Also for Relativity users, this resource provides at-a-glance guidance on which searching option is right for your project.
Several engines are built into Relativity, including keyword search, dtSearch, and Lucene Search. When you need to decide which of these tools is the right one for your specific search, you can look at the searching guide to quickly compare what operators and options each engine offers.
Wondering how fuzzy search works in dtSearch versus Lucene? Need to know whether RegEx is an option in a basic keyword search? The guide will give you these answers quickly. Keep a copy at your desk, or take one with you to your next meeting with the partners on your case to ensure you can deliver on their expectations.
Search terms aren’t always by the book (Webster’s book, that is). Sometimes, slang and colloquialisms can be important to a dispute—and you need to know how to interpret them.
Whether it’s the word “gegs” on your original search list or abbreviations like “AAMOF” in your results, it never hurts to have a good pop culture resource on hand to tell you exactly what you’re reading when it comes to doc-to-doc review.
Sam Bock is a member of the marketing communications team at Relativity, and serves as editor of The Relativity Blog.