In honor of our first week of Spring Geekly—which celebrates (or condemns) novel mashups in the worlds of branding, marketing, and entertainment—we set out to find the best restaurants in Austin that combine multiple cuisines. Our definition of best? The restaurants with the most positively passionate fans, of course!
By now, we know the shortfalls of Yelp: opaque algorithms, too many choices, and thousands of reviews to read through—in the case of best fusion restaurants, 36,265 reviews to be exact. Yelp’s default top 10 list even includes a restaurant with only 3.5 average stars! Take a look:
Now, if you live in or have visited Austin, you likely recognize some names on this list: Chi’Lantro, in particular, is famous for appearing on Shark Tank. But are these the spots that fans love the most? VADER says no.
VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon- and rule-based sentiment analysis tool we’ve been using recently for a number of clients. What’s sentiment analysis? Well, technically, it’s a way of turning qualitative data (reviews) into quantitative data (sentiment scores) to measure the extent to which a reviewer is writing from a positive or negative emotional state, as well as the extent to which the words they use are positive or negative.
In layman's terms: it’s a way to tell how positive or negative a review is, and how passionate that person is.
The exciting thing is that VADER’s sentiment dictionary was built for the way people actually write on social media (think Twitter, slang and emoticons included) and was scored by a large number of human raters. This helps reduce bias by relying on “the wisdom of the crowd”: collective opinion is more trustworthy than individual opinion. In fact, in the study that introduced it, VADER not only outperformed 11 other benchmark approaches, but it even beat individual human raters at classifying tweets. Oh, and VADER can be deployed to analyze massive amounts of data really quickly.
How we used VADER to find the best Austin fusion restaurants
As always, the first step is to collect the raw review data. We scraped every review from the 114 restaurant results that appeared in our Yelp search, which got us a whopping 36,265 reviews. Next, we culled the data to even out sample sizes and make sure each restaurant had enough reviews for VADER to devour. That left 87 restaurants, each with more than 100 reviews.
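The culling step can be sketched in a few lines of Python. This is a minimal illustration rather than the actual pipeline: it assumes the scraped reviews are already in memory as (restaurant, review text) pairs, and the names `filter_restaurants` and `min_reviews` are made up for the example. It covers only the minimum-review cutoff, not the evening out of sample sizes.

```python
from collections import defaultdict


def filter_restaurants(reviews, min_reviews=100):
    """Group reviews by restaurant and keep the well-reviewed ones.

    `reviews` is an iterable of (restaurant_name, review_text) pairs,
    an assumed layout chosen for this sketch.
    """
    by_restaurant = defaultdict(list)
    for name, text in reviews:
        by_restaurant[name].append(text)
    # Strictly more than the cutoff, matching "more than 100 reviews".
    return {
        name: texts
        for name, texts in by_restaurant.items()
        if len(texts) > min_reviews
    }
```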
Then VADER went to work: each review was analyzed and assigned sentiment scores. The outputs of this technique are 1) a measure of sentiment intensity of the review with a range from -100% to +100%; and 2) a percentage breakdown of the actual sentiment based on the words used (positive, neutral, or negative).
For example, “BEST FUSION MEAL EVER!!” scores high on positive intensity, because VADER takes into account things like capitalization and punctuation. However, only 25% of the words in the review are strictly positive (“best”). On the other hand, “Hanabi is amazing, delicious and divine sushi” is less intense, but has a higher proportion of positive words (“amazing,” “delicious,” and “divine”).
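For the curious, here is roughly what scoring a single review looks like in Python. This is a sketch under assumptions: it uses the open-source `vaderSentiment` package, whose `polarity_scores` method returns `neg`, `neu`, `pos`, and `compound` values, and it treats `compound` as the intensity and the other three as the percentage breakdown. That mapping, and the helper names `make_analyzer` and `score_review`, are our illustration and not part of VADER itself.

```python
def make_analyzer():
    """Create a VADER analyzer (needs the third-party vaderSentiment package)."""
    # Imported lazily so score_review() can be tried without the package installed.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer()


def score_review(text, analyzer):
    """Return intensity (-100 to 100) and pos/neu/neg percentages for one review."""
    raw = analyzer.polarity_scores(text)  # dict with neg, neu, pos, compound
    return {
        "intensity": raw["compound"] * 100,   # compound ranges from -1 to +1
        "pct_positive": raw["pos"] * 100,     # pos, neu, neg each range from 0 to 1
        "pct_neutral": raw["neu"] * 100,
        "pct_negative": raw["neg"] * 100,
    }
```

With the package installed, `score_review("BEST FUSION MEAL EVER!!", make_analyzer())` would return the four numbers for that review.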
Once VADER translated qualitative reviews into quantitative scores, we could work with the data.
Combining intensity and positivity
By multiplying the positive intensity by the percent positive words, and the negative intensity by the percent negative words, we derive what we call a “Passion Score” (PS). The PS—on a scale of -100 to +100 with 100 being the most positively passionate—gives us a holistic idea of the true sentiment behind a review, providing equal weight to the words a person is using AND how passionate they are about what they’re saying.
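Here is a rough sketch of that idea in Python. It is one possible reading of the recipe above, not necessarily the exact formula behind the rankings: it assumes the intensity is a single number between -1 and +1, that its positive part is weighted by the share of positive words and its negative part by the share of negative words, and that the result is scaled to the -100 to +100 range. The function name `passion_score` and its parameters are hypothetical.

```python
def passion_score(intensity, share_positive, share_negative):
    """Combine sentiment intensity with word shares into one score.

    intensity:       overall sentiment intensity, from -1.0 to +1.0
    share_positive:  fraction of positive words, from 0.0 to 1.0
    share_negative:  fraction of negative words, from 0.0 to 1.0
    Returns a value between -100 and +100 (assumed scaling).
    """
    # Only one of these two parts is non-zero for any given review.
    positive_part = max(intensity, 0.0) * share_positive
    negative_part = min(intensity, 0.0) * share_negative
    return 100 * (positive_part + negative_part)
```

Under this reading, a review only earns a high score when it is both intense and dense with positive words, which is the "equal weight" idea described above.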
Using Passion Score, here are the top 10 fusion restaurants in Austin with the most positively passionate fans:
As you can see, using Passion Score gives us a very different list than Yelp’s default top 10: only two restaurants make both lists, Koriente and Jenna’s Asian Kitchen.
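Ranking by average score is simple enough to sketch. The snippet below assumes each review has already been reduced to a (restaurant, score) pair; the function name `top_restaurants` and that input layout are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def top_restaurants(scored_reviews, n=10):
    """Return the n restaurants with the highest average score.

    `scored_reviews` is an iterable of (restaurant_name, score) pairs.
    """
    scores = defaultdict(list)
    for name, score in scored_reviews:
        scores[name].append(score)
    averages = {name: mean(values) for name, values in scores.items()}
    # Sort by average score, highest first, and keep the first n.
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```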
Looking at the whole picture
Average Passion Score helps us rank our top 10, but when you visualize the Passion Scores of all the individual reviews, you can learn a lot more. Take a look at this lovely box-and-whisker plot, which shows every review (the dots), the average of the reviews (the tall line in the middle), where the middle 50% of the reviews are concentrated (the box), and the bounds beyond which a review counts as an outlier (the whiskers to the left and right).
You can think of the lines as a tug of war of passion, with negative reviews on the left and positive reviews on the right. The more spread out the data is, the more opposing opinions there are. For example, while the top three restaurants have the highest averages, they also have some of the most spread-out data, meaning that there are a lot of people on both sides of the passion spectrum. Dragonbeard Kitchen fans, on the other hand, all feel similarly positive about the restaurant, just not as positive as those of the top three.
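The numbers behind a plot like this can be computed with Python's standard library. The sketch below is illustrative only: it assumes the common convention of placing the outlier bounds 1.5 times the box width beyond each edge of the box, which may differ from how the chart above was drawn, and the name `box_plot_summary` is hypothetical.

```python
from statistics import mean, quantiles


def box_plot_summary(scores):
    """Summarize one restaurant's review scores the way a box plot does."""
    q1, median, q3 = quantiles(scores, n=4)  # the three quartile cut points
    box_width = q3 - q1                      # range holding the middle 50%
    # Assumed convention: anything past 1.5 box widths counts as an outlier.
    low_bound = q1 - 1.5 * box_width
    high_bound = q3 + 1.5 * box_width
    return {
        "mean": mean(scores),
        "q1": q1,
        "median": median,
        "q3": q3,
        "spread": box_width,
        "outliers": sorted(s for s in scores if s < low_bound or s > high_bound),
    }
```

A restaurant with a wide `spread` has fans pulling hard on both ends of the rope, while a narrow one, like Dragonbeard Kitchen in the chart, has reviewers who mostly agree.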
So whose fans are the most positively passionate?
Congratulations, Hanabi! Based on our analysis and assessment of 36,000+ Yelp reviews, Hanabi Sushi has the most positively passionate fans. Your second location, Hanabi Ramen, made the top 3, too! A great happy hour might have something to do with it...
Want to learn more about this project or have an idea about sentiment analysis applied to your work? Contact us.