Fake News Classification in 2024 News Articles
Session Number
CMPS(ai) 15
Advisor(s)
Courtland VanDam, MIT Lincoln Laboratory
Discipline
Computer Science
Start Date
17-4-2025 10:45 AM
End Date
17-4-2025 11:00 AM
Abstract
The spread of false information through digital news outlets has driven the development of robust machine learning models for detecting fake news. Using a labeled dataset, this study investigates how well different classification and text representation strategies distinguish fake from authentic news. We compare deep learning architectures, such as convolutional neural networks (CNNs) and transformers, with conventional machine learning classifiers, such as logistic regression, support vector machines, and random forests. To evaluate the effect of the text representation on classification performance, we also examine TF-IDF features, Word2Vec embeddings, and BERT embeddings. Our findings indicate that transformer-based models, in particular fine-tuned BERT variants, outperform conventional methods in precision and recall by making better use of contextual semantics. However, lightweight models that pair TF-IDF with logistic regression deliver competitive performance at significantly lower computational cost.
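To make the lightweight baseline concrete, the following is a minimal sketch of TF-IDF features fed into a logistic regression classifier, written with only the Python standard library. It is not the authors' pipeline: the four training headlines, the function names, and the hyperparameters (`epochs`, `lr`) are assumptions chosen for illustration, and a real study would more likely rely on an established library and a far larger labeled corpus.

```python
import math
from collections import Counter


def tokenize(text):
    # Illustrative tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency for every term seen in training.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc, idf):
    # L2-normalised TF-IDF vector as a sparse dict; unseen terms are dropped.
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def predict_proba(vec, weights, bias):
    # Logistic regression: sigmoid of a weighted sum of the features.
    z = bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(vectors, labels, epochs=200, lr=0.5):
    # Plain stochastic gradient descent on the log loss (no regularisation).
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            err = predict_proba(vec, weights, bias) - y
            bias -= lr * err
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
    return weights, bias


# Invented toy data (1 = fake, 0 = authentic); not taken from the study.
train_texts = [
    "shocking miracle cure doctors hate",
    "you will not believe this shocking secret",
    "city council approves annual budget",
    "central bank holds interest rates steady",
]
train_labels = [1, 1, 0, 0]

idf = fit_idf(train_texts)
vectors = [tfidf(t, idf) for t in train_texts]
weights, bias = train(vectors, train_labels)


def classify(text):
    # Threshold the predicted probability of the "fake" class at 0.5.
    return int(predict_proba(tfidf(text, idf), weights, bias) >= 0.5)


print(classify("shocking secret cure"))
print(classify("council approves budget"))
```

The whole model is a dictionary of per-term weights plus a bias, which is one reason this kind of baseline is cheap to train and run compared with a fine-tuned transformer. The trade-off is that each term is scored independently of its context.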