The 2018 Oscars are this Sunday, March 4, and there are nine films up for best picture: “Get Out”, “Lady Bird”, “Call Me by Your Name”, “Dunkirk”, “Three Billboards Outside Ebbing, Missouri”, “The Shape of Water”, “Phantom Thread”, “The Post”, and “Darkest Hour”. Many are predicting either “Three Billboards” or “The Shape of Water” to take the cake. But is this what the people want?!
We decided to analyze Rotten Tomatoes to find out, with a little help from our friend, VADER.
VADER strikes again
(If you’ve already read our post on using VADER to determine the best coffee shops in Austin, you can skip this section.)
VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon- and rule-based tool we’ve been using recently for a number of clients to conduct sentiment analysis. What’s sentiment analysis? Well, technically, it’s a way of turning qualitative data (reviews) into quantitative data (sentiment scores) to measure the extent to which a reviewer is writing from a positive or negative emotional state, as well as the extent to which the words they use are positive or negative.
In layman's terms: it’s a way to tell how positive or negative a comment is, and how passionate that person is.
The exciting thing is that VADER was built for social media text like tweets, and its word ratings come from a large number of human raters. This helps reduce bias by relying on “the wisdom of the crowd”: collective opinion is more trustworthy than individual opinion. In fact, in a large study, VADER not only outperformed 11 other benchmark approaches, but it even beat individual human raters. Oh, and VADER can be deployed to analyze massive amounts of data really quickly.
What Rotten Tomatoes thinks
Putting VADER aside for a second, let’s look at Rotten Tomatoes. From a Tomatometer standpoint, as you can see below, “Get Out” and “Lady Bird” rise to the top. But critic reviews are hardly the same as audience reviews. Plus, Rotten Tomatoes defines the Tomatometer score as “the percentage of approved Tomatometer critics who gave the film a positive review”, which narrows the perspective down even more.
The Rotten Tomatoes audience score is a little closer to measuring the people’s choice. Hundreds of thousands of individuals submit their star rating, and Rotten Tomatoes gives the percent of those who rated the movie 3.5 stars or higher. In this case, audiences and critics agree that “Get Out” is the best picture of the bunch. But does a star rating really capture what people are thinking?
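As a rough illustration of how a score like this reduces each rating to a yes or no, here is a minimal Python sketch. The function name `audience_score` and the toy ratings are our own inventions for illustration; this is not Rotten Tomatoes’ actual code or data.

```python
def audience_score(star_ratings, threshold=3.5):
    """Return the percent of ratings at or above the threshold (assumed 3.5 stars)."""
    if not star_ratings:
        raise ValueError("need at least one rating")
    # Each rating collapses to a yes/no: did it clear the threshold?
    liked = sum(1 for stars in star_ratings if stars >= threshold)
    return 100.0 * liked / len(star_ratings)


# Made-up ratings for illustration only (not real Rotten Tomatoes data).
toy_ratings = [5.0, 4.5, 3.5, 3.0, 2.0, 4.0, 1.5, 3.5]
print(audience_score(toy_ratings))
```

Notice that a 3.5-star “it was fine” and a 5-star rave count exactly the same, which is the sense in which a star-based percentage is reductive.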
What VADER thinks
Rather than a score based on positive critic reviews, which is biased, or one based on audience star reviews, which is reductive, VADER looks at the whole picture. So we mined the written audience reviews for each nominee, and fed all 6,869 of them into VADER.
Each review was analyzed and assigned sentiment scores. The outputs of this technique are 1) a measure of sentiment intensity of the comment with a range from -100% to +100%; and 2) a percentage breakdown of the actual sentiment based on the words used (positive, neutral, or negative).
For example, “I LOVE THIS MOVIE!!” scores high on positive intensity, because VADER takes into account things like capitalization and punctuation. However, only 25% of the words in the review are strictly positive (“love”). On the other hand, “Love The Post — it’s the best” is less intense, but has a higher proportion of positive words (“love” and “best”).
Very nice, but wouldn’t it be even nicer to have a metric that takes into account both what people say and how they say it?
What the people want
By multiplying the positive intensity by the percent positive words, and the negative intensity by the percent negative words, we developed a metric we’re calling “Passion Score” (PS). The PS—on a scale of -100 to +100 with +100 being the most positively passionate—gives us a holistic idea of the true sentiment behind a review, giving equal weight to the words a person is using AND how intense they are about what they’re saying.
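For the curious, here is one way the calculation could be sketched in Python. This is a hedged reading of the description above, not the exact production code: we assume the intensity is a single signed value between -1 and 1 (the key we call `compound` here), that the positive and negative word shares are fractions between 0 and 1 (`pos` and `neg`), and that the intensity is paired with whichever share matches its sign. The function name `passion_score` and the sample inputs are hypothetical.

```python
def passion_score(scores):
    """Combine signed intensity with the matching word share, scaled to -100..+100.

    `scores` is assumed to be a dict with:
      'compound' -- signed intensity in [-1, 1]
      'pos'      -- share of positive words in [0, 1]
      'neg'      -- share of negative words in [0, 1]
    """
    compound = scores["compound"]
    if compound > 0:
        # Positive intensity times the share of positive words.
        return 100.0 * compound * scores["pos"]
    if compound < 0:
        # Negative intensity times the share of negative words (result stays negative).
        return 100.0 * compound * scores["neg"]
    return 0.0


# Hypothetical inputs for illustration; not actual VADER output for any review.
enthusiastic = {"compound": 0.8, "pos": 0.5, "neg": 0.0}
scathing = {"compound": -0.5, "pos": 0.0, "neg": 0.5}
print(passion_score(enthusiastic), passion_score(scathing))
```

Because the two factors are multiplied, a review has to be both intense and heavy on positive (or negative) words to land near either end of the scale.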
Using PS we can definitively rank the nominees according to what the people want:
Do averages lie?
Looking at the passion score results, you might wonder, like we did, “were people really that ‘meh’ about ‘Dunkirk’?” It’s an understandable reading of the average, but not quite true. We don’t think averages lie, per se, but they rarely tell the full story. That’s why we decided to visualize the passion scores of every individual review using this lovely box-and-whisker plot:
What this all means
Based on VADER’s analysis of Rotten Tomatoes audience reviews, and our passion scoring, “Call Me by Your Name” has the highest percentage of positively passionate reviews. That’s why, in the graph above, its average and interquartile range (the box) are the furthest to the right of any movie. “Dunkirk,” on the other hand, has roughly as many positively passionate reviews as negatively passionate ones, pulling its average closer to zero.
You can think of it as a tug of war of passion, with those who dislike the movie on the left and those who like it on the right. The more spread out the data, the more opinions diverge: “Three Billboards,” for example, is the most divisive movie.
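To make the box plot reading concrete, here is a small Python sketch of the numbers behind each box: the average, the quartiles, and the interquartile range (the width of the box). The helper `box_summary` and both toy score lists are made up for illustration and are not the review data from this analysis.

```python
import statistics


def box_summary(scores):
    """Return the mean, quartiles, and interquartile range for a list of scores."""
    q1, median, q3 = statistics.quantiles(scores, n=4)
    return {
        "mean": statistics.mean(scores),
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,  # width of the box: wider means more disagreement
    }


# Toy passion scores, invented for illustration only.
toy_consensus = [20, 25, 30, 35, 40]   # everyone mildly positive
toy_divisive = [-60, -30, 0, 30, 60]   # lovers and haters cancel out

print(box_summary(toy_consensus))
print(box_summary(toy_divisive))
```

The second list averages out to zero even though almost nobody in it is neutral, which is the kind of story an average alone would hide.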
What does it mean for Sunday’s Oscars? It’s unlikely that the people’s choice is going to win best picture—but no one saw “Moonlight” winning, either!
Want to learn more about this project or have an idea about sentiment analysis applied to your work? Contact us.