The 'fake news' scandal of the 2016 US presidential election made apparent the need for stronger safeguards on social media to protect the public from those who would deceive it for personal gain. This study attempts to meet this need by building an automated system capable of detecting fake news published during the 2016 US presidential campaign season. For its data analysis, it uses a set of articles flagged as false by Snopes, another set from leading news organisations, and selected machine learning algorithms trained on textual content alone. These models are also given the sentiment-related features of each article to better predict its factual accuracy.
Highlights:
- The most successful model in the study is the Long Short-Term Memory (LSTM) model, which uses the word embedding representation of each article. The LSTM's high average F1-score (0.90) shows that it is adept at distinguishing credible articles from untrustworthy ones.
- The Support Vector Machine model also performed well, achieving an overall F1-score of 0.88. However, adding the sentiment features degraded its performance, whereas the same features improved the performance of the K-Nearest Neighbours (KNN) model.
- The study has limitations, however, that call for further research. The approaches used here need to be applied to, or adapted for, a broader range of domains to test their effectiveness.
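
For readers unfamiliar with the metric reported in the highlights, the following is a minimal sketch of how an average F1-score over two classes could be computed. It assumes that "average" means an unweighted (macro) mean of the per-class F1-scores, which the summary does not confirm. The function names, the labels `fake` and `real`, and the toy predictions are hypothetical and are not the study's data or results.

```python
from typing import Hashable, Sequence


def f1_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                 positive: Hashable) -> float:
    """F1-score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 (assumed meaning of 'average F1')."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical toy labels, for illustration only.
y_true = ["fake", "fake", "fake", "real", "real", "real"]
y_pred = ["fake", "fake", "real", "real", "real", "fake"]
print(round(macro_f1(y_true, y_pred), 3))
```

Averaging per-class scores in this way gives equal weight to credible and untrustworthy articles regardless of how many of each appear in the test set; a support-weighted average would be an equally plausible reading of the reported figures.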