In my last article, I covered extracting tweet data from Twitter, then filtering and cleaning it. Today, I want to look at sentiment analysis of the replies to tweets, pulled back through the API.
For this, I am going to use VADER (the vaderSentiment library), a sentiment analyser specifically tuned for social media text. It needs to be installed and initialised before use.
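A minimal sketch of the install and initialisation step, assuming the `vaderSentiment` package from PyPI. The helper names `make_analyzer`, `describe` and `demo`, and the sample sentence, are illustrative and not taken from the original code. The thresholds in `describe` are the ones suggested in the VADER README.

```python
# Install first (shell): pip install vaderSentiment


def make_analyzer():
    """Create a VADER analyser. The import is local so the package is only needed when called."""
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer()


def describe(compound):
    """Label a compound score using the cut-offs suggested in the VADER README."""
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"


def demo():
    """Score one illustrative sentence and print the result."""
    analyzer = make_analyzer()
    # polarity_scores returns a dict with 'neg', 'neu', 'pos' and 'compound' keys
    scores = analyzer.polarity_scores("This is brilliant news!")
    print(scores, describe(scores["compound"]))
```

Calling `demo()` prints the full score dictionary for the sample sentence along with its label.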
I loop through the list of extracted tweet IDs, pass each ID into the API and pull back its replies. Each reply then goes through a function I’ve called ‘inner’, which produces the sentiment scores for that reply.
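The loop can be sketched roughly as follows. This is an assumption-heavy outline rather than the original code: `fetch_replies` is a hypothetical callable standing in for whichever Twitter API client call returns the reply texts for a tweet ID, and the signature of `inner` is a guess based on the description above.

```python
def inner(reply_text, analyzer):
    """Return the full VADER score dict for one reply (assumed signature)."""
    return analyzer.polarity_scores(reply_text)


def score_replies(tweet_ids, fetch_replies, analyzer):
    """Map each tweet ID to the list of score dicts for its replies.

    fetch_replies is hypothetical: tweet_id -> list of reply strings.
    analyzer is any object with a polarity_scores(text) method,
    such as vaderSentiment's SentimentIntensityAnalyzer.
    """
    results = {}
    for tweet_id in tweet_ids:
        replies = fetch_replies(tweet_id)
        results[tweet_id] = [inner(text, analyzer) for text in replies]
    return results
```

Passing the API call in as an argument keeps the scoring logic independent of the client library, which also makes it easy to try out with canned replies.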
We then use the extract_compound_sentiment function to extract only the compound score from the sentiment output.
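A sketch of what `extract_compound_sentiment` might look like, assuming it receives the dictionary that VADER's `polarity_scores` returns. The example values are made up for illustration and are not real results.

```python
def extract_compound_sentiment(scores):
    """Keep only the 'compound' value from a VADER score dict."""
    return scores["compound"]


# Example score dicts in the shape VADER returns (illustrative values only)
reply_scores = [
    {"neg": 0.0, "neu": 0.4, "pos": 0.6, "compound": 0.62},
    {"neg": 0.5, "neu": 0.5, "pos": 0.0, "compound": -0.48},
]

compound_scores = [extract_compound_sentiment(s) for s in reply_scores]
print(compound_scores)
```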
Finally, we calculate the mean of those compound scores for each tweet. This gives us an overall view of how the replies lean. We end up with a single dataframe with two columns: the tweet ID and its mean compound score.
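The averaging step could look something like this. The column names `tweet_id` and `compound`, and the choice to record `None` for a tweet with no replies, are assumptions made for the sketch.

```python
from statistics import mean


def mean_compound_rows(compound_by_tweet):
    """Turn {tweet_id: [compound, ...]} into one row per tweet with the mean score."""
    rows = []
    for tweet_id, scores in compound_by_tweet.items():
        # A tweet with no replies has no mean; record None rather than guessing
        average = mean(scores) if scores else None
        rows.append({"tweet_id": tweet_id, "compound": average})
    return rows


rows = mean_compound_rows({"111": [0.5, -0.25, 0.5], "222": []})
print(rows)
```

A list of dictionaries like `rows` can be passed straight to `pandas.DataFrame(rows)` to get the two-column dataframe.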
This method is quick, but be careful: you may hit the rate limits on the Twitter API. If you are streaming data from a handful of specific Twitter users, I would expect you to be OK. If you are pulling back a batch of tweets and then all of their replies, you may start reaching the limits.
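If you do stay with the reply-based approach, one common way to cope with rate limits is to wait and retry with an increasing delay. This is not part of the original code; it is a generic sketch. `RateLimitError` is a hypothetical stand-in for whatever exception your API client raises when the limit is hit, and the delay values are arbitrary.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the rate-limit exception raised by the API client in use."""


def call_with_backoff(func, *args, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(*args), retrying after a rate-limit error.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, and so on.
    The sleep function is injectable so the behaviour can be checked without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return func(*args)
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            sleep(base_delay * 2 ** attempt)
```

Each call that pulls back replies for a tweet ID could then be wrapped as `call_with_backoff(fetch_replies, tweet_id)`, where `fetch_replies` is again a placeholder for your own API call.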
We can get around that issue by taking a different route to assess sentiment: scoring the article behind the tweet. The full article text can’t be extracted from Twitter directly, so we need to scrape it from the BBC website, using the requests module in Python to fetch the page. We then run the same sentiment analysis library as above over the entire article text.
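A rough sketch of the scraping step. It fetches the page with `requests` and pulls the text out of `<p>` tags using the standard library's `html.parser`. Treating every `<p>` tag as article body is an assumption; a real BBC page may need more targeted selection, and the helper names here are illustrative.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text found inside <p> tags."""

    def __init__(self):
        super().__init__()
        self._in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_paragraph = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> or <a> still arrives here
        if self._in_paragraph:
            self.paragraphs[-1] += data


def article_text(html):
    """Join the non-empty paragraph texts of an HTML document into one string."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return " ".join(p.strip() for p in parser.paragraphs if p.strip())


def fetch_article_text(url):
    """Download a page and return its paragraph text."""
    import requests  # third-party: pip install requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return article_text(response.text)
```

The returned string can then be scored in the same way as a reply, for example with `analyzer.polarity_scores(text)["compound"]`.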
The article I chose is below. It gets a polarity score of -0.8 on VADER’s compound scale, which runs from -1 (most negative) to +1 (most positive), so that is a highly negative sentiment. So it works well!
By coupling the sentiment of the replies to a tweet with the sentiment of the entire article text, we can get a good picture of the sentiment around a news article.