Benchmarking the Generalization and Robustness of AI Models for Fake News Detection Across Political, Business, and Health Domains
Faculty Mentor
Nicolas Gallo
Major/Area of Research
Artificial Intelligence, Computer Science
Description
INTRODUCTION: As misinformation and deliberate information manipulation spread across the digital landscape, it has become increasingly difficult to distinguish authentic material from fabricated content on social media and news platforms, which undermines the credibility of public knowledge. This research presents a principled benchmark comparing the performance of three advanced machine learning models and a combined ensemble model in detecting fake news in Politics, Business, and Health.
METHOD: Using a curated, balanced corpus of 20,000–30,000 fact-checked news articles drawn from public datasets covering Politics, Business, and Health, multiple AI models will be trained to detect heterogeneous misinformation patterns: two discriminative transformer models (DeBERTa, ELECTRA), a generative LLM (Falcon-7B Instruct), and a combination of the transformer and LLM architectures. To compare their strengths and weaknesses, model performance will be assessed in terms of Accuracy, Macro-Precision, Macro-Recall, Macro-F1, and ROC-AUC under three evaluation criteria: in-domain accuracy, reliability under cross-domain shifts, and adaptability when input text is modified.
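As an illustration of the metrics named in the METHOD section, the following is a minimal sketch, assuming a binary real/fake labeling. The function names, label strings, and sample data are hypothetical and not taken from the study; the actual evaluation pipeline may differ.

```python
def macro_metrics(y_true, y_pred, labels=("real", "fake")):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging computes each metric per class and then takes the
    unweighted mean, so both classes count equally.
    """
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


def roc_auc(y_true, scores, positive="fake"):
    """Rank-based ROC-AUC: the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative one,
    with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: gold labels, predicted labels, and P(fake) scores.
gold = ["fake", "fake", "real", "real"]
pred = ["fake", "real", "real", "real"]
fake_scores = [0.9, 0.4, 0.6, 0.1]

print(macro_metrics(gold, pred))
print(roc_auc(gold, fake_scores))
```

Under the three evaluation criteria described above, the same functions would simply be applied to different test sets: one drawn from the training domain, one from a held-out domain, and one containing modified text.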
RESULTS: Through rigorous statistical evaluation and testing, this research aims to report the label distribution within the curated corpus and to provide a comprehensive quantitative benchmark of performance under each evaluation criterion. The results will compare overall accuracy, domain-specific versus cross-domain performance, and model robustness when models are exposed to unseen contexts and subjected to controlled text modifications.
DISCUSSION/CONCLUSION: Overall, the benchmark is designed to identify which architectures are most dependable for misinformation detection in realistic settings where topics and wording change. This research seeks to contribute to the development of more reliable, socially responsible, and transparent AI-driven misinformation detection mechanisms for maintaining the integrity of information ecosystems.