First I went online and searched for some data I could use to visualize the Eurovision song contest. Luckily I’m not the only fan out there, and I found historical data here. I took some of the data ranging from 2010 to 2016 and of course of my home country the Netherlands.
First of all, It is highly recommended to go to full screen mode with the arrows in the right-bottom corner to play with the dashboard below
On the first page of the dashboard you see a map of the world (why is Australia a contester?!) and the average points won by these countries over the years 2010 to 2016, assuming you didn’t set any filters. To visualise this, the first bump on the road had to be concurred: the Esri map doesn’t work yet if you embed your Power BI dashboard into a web page, this is why we used the shape map for this blog. I am sure that the Power BI team will get rid of this problem soon, because the Esri map has many benefits over the Bing maps.
When no filters are used, the table ‘final contesters’ shows an overview of the year, country, place, points, start number, performer and song. To refresh your memory, I’ve added links to search YouTube on “Eurovision”, the name of the performer and the song title. The hyperlinks to search are easily created by a simple DAX formula, but makes the dashboard a lot more informative and fun. It is highly recommended to go to full screen modus with the arrows in the right-bottom corner
For me, the most insightful thing in the dashboard is the bar chart on the bottom, showing the average amount of points received by starting number. It is clear to see that the acts that are starting earlier score much lower on average points than the acts starting last. Since the starting position of the Netherlands was 6th, it is easy to find a reason why we did not win last weekend ;).
You can see The Dax Formula that we used to create the youtube URL above
The first page of the dashboard shows us some facts, but the second page shows us the opinions found on twitter. To create this dashboard I’ve used R studio to extract the tweets and do the analysis. Next, I’ve saved the results to SQL Server, and thereafter visualised it in Power BI. This way I was able to combine the flexibility and statistical power of R and the graphically strong visualisations of Power BI.
To scrape the tweets from Twitter in R I’ve used the package twitteR. To use the search Twitter functionality, you need to have an API access token (which you can retrieve here) 5000 tweets were scraped, from which 1011 tweets remained after deleting the retweets.
To identify positive and negative words I’ve used the dictionaries from this website. I was not completely happy about the list, so I’ve added some Eurovision-specific words, like ‘vote’. The tweets had to be cleaned first to do the sentiment analysis. For example the #’s and @’s had to be deleted. As can be seen in the word cloud on the second page of the dashboard, the cleaning was not that neatly, since some noise is still present.
After the (not so neatly) cleaning, a score was calculated by adding one point for each positive word found in a tweet and subtracting one point for each negative word found. The score for each tweet was saved to a SQL Server database (using this RODBC package) , to access it more easily from Power BI.
In Power BI, the word cloud shows the most occurring words in the tweets. On the bottom of the page, a bar graph shows how many positive/negative tweets were found (amount of tweets by score). Most of them (689) have a score of 0, meaning neither positive nor negative. When clicking on that bar, the word cloud changes to represent the tweets with a score of 0. By clicking the bar again, the filter will reset.
The tweets where searched by using #NED and #Eurovision. Therefore, when you click on one of those two words, the filtering contains all the records. When clicking on one of the other words however, you can see in the bar graph on the bottom of the report how many times this word was present in tweets.
Well, I’ve extended the fun of the Eurovision song contest to a maximum. When you have any questions or would like to receive the R code used for scraping tweets and the sentiment analysis, please feel free to contact us.
Bonus: My favourite of 2017: