Conclusion

In this project, we presented a quantitative, comparative study of techniques for investigating happiness scores by U.S. state, including sentiment analysis of tweets, machine learning, clustering, and predictive analysis. By pulling tweets from the Twitter platform, we were able to use the real-time sentiment of individuals in each state to gauge the happiness of its population. Across the analyses conducted, we found that happiness levels differ significantly among U.S. states and that the sentiment of tweets is related to a state's happiness score.

Recent years have also seen growing research interest in analyzing tweets for the sentiment they express. This interest stems from the large number of messages posted on Twitter every day, which contain valuable information about public mood on a range of topics. More specifically, researchers have observed that Twitter users appear to reflect their life satisfaction in their tweets, freely expressing both positive and negative emotions. Our approach to sentiment analysis is more sensitive than a simple positive/negative split: it distinguishes several levels of sentiment, on a scale from very positive to very negative, which yields a more accurate assignment of each tweet to a category. By examining the level of happiness in the emotional expressions posted by Twitter users, we were able to see a relationship with the overall happiness of the population, as reported in the Happiest States Data.
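As an illustration of the graded sentiment scale and the state-level comparison summarized above, the following is a minimal sketch rather than the project's actual pipeline. It assumes each tweet already has a polarity score in [-1, 1]; the cut-off values, the function names `categorize` and `pearson`, the state labels, and all numbers are hypothetical placeholders chosen only to make the example runnable.

```python
from math import sqrt


def categorize(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to one of five sentiment levels.

    The cut-offs below are assumed for illustration, not taken from the project.
    """
    if polarity <= -0.6:
        return "very negative"
    if polarity <= -0.2:
        return "negative"
    if polarity < 0.2:
        return "neutral"
    if polarity < 0.6:
        return "positive"
    return "very positive"


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Placeholder inputs: per-state tweet polarities and happiness scores.
# These are made-up values, not results from the study.
tweet_polarity = {
    "A": [0.7, 0.3, 0.1],
    "B": [-0.4, 0.0, 0.2],
    "C": [-0.8, -0.3, 0.1],
}
happiness = {"A": 66.0, "B": 55.0, "C": 48.0}

states = sorted(tweet_polarity)
state_sentiment = [mean(tweet_polarity[s]) for s in states]
state_happiness = [happiness[s] for s in states]

# Correlation between mean tweet sentiment and happiness score across states.
r = pearson(state_sentiment, state_happiness)

for s in states:
    labels = [categorize(p) for p in tweet_polarity[s]]
    print(s, labels)
print("correlation:", round(r, 3))
```

With real data, the per-tweet polarity would come from whichever sentiment model is used, and the correlation would be computed over all states rather than three placeholder entries.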