e-Discovery and the Oscars: Can Analytics Pick the Best Picture?

Late winter isn’t everyone’s favorite time of year, but for movie lovers, February is a month to cherish as the year’s best films are showcased in a cavalcade of awards ceremonies, culminating in the Oscars® ceremony this Sunday.

Unfortunately, with so many great films to see, it’s sometimes tricky to catch them all before the awards are handed out. That presents quite the dilemma—with eight Best Picture nominees alone, how do you choose which films are most deserving of your time and money this weekend? And perhaps more importantly, which one is going to walk away the big winner?

As both movie geeks and e-discovery nerds, we found ourselves wondering: could analytics help? We decided to put this to the test.

Setting the Stage

To leverage analytics, we needed some data on each film, and the movie review website Rotten Tomatoes proved to be a perfect source. To start, we copied and pasted 100 different critics’ reviews of each nominated film into individual text files, then quickly ingested these files into a review workspace via Relativity Processing.

The next step was building an analytics index, which breaks down each document into individual terms and maps them into a concept space. That space lets us examine the data not simply by which terms documents share, but by how closely the documents are related in conceptual meaning.
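For readers curious what "terms mapped into a concept space" can look like in practice, here is a minimal Python sketch. It is not Relativity's algorithm, whose internals this post doesn't describe. It swaps in a much cruder stand-in: each term is described by the other terms it co-occurs with, and a piece of text is described by the sum of its terms' co-occurrence vectors. The sample reviews, the stopword list, and every function name are made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "and", "the", "in", "of", "for", "to"}


def tokenize(text):
    """Lowercase the text and return its alphabetic terms, minus stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(w * b.get(k, 0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_context(docs):
    """For every term, count the terms that appear in the same documents."""
    context = {}
    for text in docs:
        counts = Counter(tokenize(text))
        for term in counts:
            context.setdefault(term, Counter()).update(counts)
    return context


def concept_vector(text, context):
    """Describe a text by summing the co-occurrence vectors of its known terms."""
    vec = Counter()
    for term, n in Counter(tokenize(text)).items():
        for other, weight in context.get(term, {}).items():
            vec[other] += n * weight
    return vec


# Made-up one-line "reviews" standing in for the indexed documents.
reviews = [
    "A trapper fights for survival in the frozen wilderness",
    "Survival and revenge drive a brutal frontier tale",
    "A young immigrant finds romance in a new city",
]
index = build_context(reviews)

wilderness_query = "frozen wilderness"
revenge_query = "brutal revenge"
romance_query = "romance city"

# The first two queries share no terms, so plain term matching sees no overlap.
term_sim = cosine(Counter(tokenize(wilderness_query)), Counter(tokenize(revenge_query)))

# In the co-occurrence space they are linked through "survival".
concept_sim = cosine(concept_vector(wilderness_query, index),
                     concept_vector(revenge_query, index))
unrelated_sim = cosine(concept_vector(wilderness_query, index),
                       concept_vector(romance_query, index))

print("term overlap:", term_sim)
print("concept similarity:", round(concept_sim, 3))
print("unrelated pair:", unrelated_sim)
```

The point of the toy example is the contrast: two texts with no words in common can still score as related once each term carries its context along with it, while a text about something else entirely stays unrelated.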

But why would conceptual similarity be helpful here? After all, wouldn’t the reviews of each movie be conceptually similar to other reviews of the same film?

That brought us to our “aha” moment—we could find reviews of our all-time favorite movies and compare them against all the reviews in our analytics index. This (we hoped) would tell us which Best Picture nominees matched up with films we already know and love.

At this point, the process was about as simple as it gets—and also quite a bit of fun. We looked at past reviews by one of our heroes of the film world, Roger Ebert, then copied the text into the Concepts box at the top of the document list in Relativity and clicked Search. We started with “Best Picture” winners from the past ten years.

Concept Searching with Movie Reviews

And the Oscar® Goes To …

The Revenant … with some caveats.

Concept search doesn’t lend itself to predictions, and conceptual similarity doesn’t necessarily measure quality. However, from our results, we can conclude that The Revenant, out of all eight nominated films, shares the most conceptual similarity with the past decade’s Best Picture winners (at least according to the 10 reviews we used for our concept search).
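The post doesn't spell out how the results of the ten searches were combined into a single pick, so the following Python sketch is only one plausible way to do the tallying. It assumes each search returns a list of (film, similarity score) hits, one per matching review, adds up each film's scores, and averages over the number of searches. The film names, the scores, and the function name are placeholders, not results from the experiment.

```python
from collections import defaultdict


def rank_films(hits_per_query):
    """Sum each film's hit scores, average over the queries, sort best first."""
    totals = defaultdict(float)
    for hits in hits_per_query:
        for film, score in hits:
            totals[film] += score
    n_queries = len(hits_per_query)
    averages = {film: total / n_queries for film, total in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Placeholder hits for two searches; each tuple is one matching review.
hits_per_query = [
    [("Film A", 0.62), ("Film B", 0.58), ("Film A", 0.55)],
    [("Film B", 0.60), ("Film A", 0.57)],
]

ranking = rank_films(hits_per_query)
for film, score in ranking:
    print(film, round(score, 2))
```

Summing rather than averaging per film means a nominee whose reviews dominate the result list is rewarded for it. Taking only each film's best hit per search would be an equally defensible choice and could produce a different ranking.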

Room also landed towards the top in our experiment—but, based on what film critics are predicting, we’re sticking with The Revenant for our pick.

So that’s how this year’s nominees stack up against other Academy favorites. But what about our favorite films? Here’s what we found:

Are you a sucker for a good love story? Ebert’s review of The Notebook brings back results dominated by the film Brooklyn.

Fan of action? A review of Die Hard shows Mad Max: Fury Road and The Revenant as most conceptually similar—but Iron Man is surprisingly more closely related to Room, Brooklyn, and The Martian.

The Shawshank Redemption, a modern classic, stacks up best with Room on a conceptual level.

On the comedy side, fans of Office Space might not be surprised at the conceptual similarity to reviews of The Big Short. But entering the review of Legally Blonde shows a closer concept match to Brooklyn and Bridge of Spies.

Finally, how could we leave out Star Wars? While similarity to The Martian and Mad Max: Fury Road was expected, it was a minor shock to see that some reviews of Room, Bridge of Spies, and Brooklyn were conceptually related as well. Upon closer inspection, elements of intrigue or suspense, heroes and villains, and family relationships are referenced in reviews of each of these films, which helps explain their conceptual similarity.

This illustrates the beauty of conceptual searching. While other forms of searching and analysis often bring back expected results based on words or phrases you know you’re looking for, using conceptual analytics helps you find connections you aren’t looking for—which can make a world of difference when you’re performing any type of investigation.

So what did we learn here? Well, we certainly have a better idea of how we’re spending our weekend.

But more importantly, we saw how the power of analytics can transcend typical review workflows to provide helpful and unexpected insights. You probably won’t start using your e-discovery software to pick your weekend flicks, but it certainly raises the question—what else can analytics do?

Do you have a favorite film you’d like us to compare against our Best Picture review index? Let us know in the comments.

Peter Fogarty is an instructional design lead on kCura’s education team, focused on developing and delivering educational materials for Relativity users, including in-person trainings, webinars, and interactive tutorials.