Easy and quick information diffusion on the web, and especially on social media (e.g.,
Facebook, Twitter), has proliferated rapidly over the past decades. Because information is posted without any verification of its veracity, fake news has become a
problem of great influence in our information-driven society. At the current rate at which
news is generated on social media, differentiating between real and fake news has become challenging. Thus, to mitigate the consequences of fake news and its propagation,
considerable research has been conducted by both academia and industry to create
automated approaches to detect malicious content. A plethora of approaches have been
investigated, most of which identify patterns in fake news only after it has already been disseminated. Early detection methods are therefore crucial.
The goal of this thesis is to review the current approaches for detecting disinformation
and to propose an effective framework that uses only the text features of the news, without any other related metadata. Several machine learning models and natural language processing techniques were used during experimentation. The findings reveal
that combining linguistic features with text-based word vector representations
through ensemble methods can detect fake news with high accuracy.