A Hybrid Framework for Fake News Detection: Integrating Ensemble Machine Learning with Real-Time API Verification
DOI:
https://doi.org/10.63163/jpehss.v3i4.1218
Keywords:
Fake News Detection, Misinformation, Ensemble Learning, Machine Learning, API Integration, Hybrid Systems, Fact-Checking
Abstract
The proliferation of misinformation in digital ecosystems presents a critical threat to informed public discourse. To address the limitations of purely data-driven or manual verification methods, this paper proposes a novel hybrid fake news detection system. The framework synergistically combines an ensemble of five machine learning models (Logistic Regression, Decision Tree, Gradient Boosting, Random Forest, and K-Nearest Neighbors) with real-time external verification via Google’s Custom Search and Fact Check APIs. The models were trained and evaluated on a large, publicly available dataset of 37,928 labeled news articles (Fake: 17,903, Real: 20,026) from Kaggle, employing a 70-15-15 split for training, validation, and testing. A modular web application, built with Flask and a responsive frontend, facilitates both single-article and batch analysis. Experimental results demonstrate robust individual model performance, with test accuracies ranging from 84.31% to 99.72%. The integrated system introduces a dynamic decision layer: when the ensemble’s prediction confidence falls below an 85% threshold, it automatically queries external APIs to cross-reference claims against trusted sources and established fact-checking databases. This hybrid approach mitigates the blind spots of static machine learning models when confronted with novel misinformation or evolving news narratives. The system offers a practical, scalable, and more reliable detection mechanism, achieving an effective balance between computational efficiency and verification rigor. This work underscores the value of hybrid methodologies in the ongoing effort to preserve information integrity online.
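The decision layer described in the abstract can be illustrated with a minimal Python sketch. Only the 85% threshold and the fallback to external verification come from the abstract; everything else is an assumption made for illustration. In particular, the abstract does not say how the five models are aggregated (soft voting over each model's probability of "fake" is assumed here), nor how an external result is reconciled with the ensemble's label (here it simply overrides). The names `Verdict`, `ensemble_confidence`, `classify`, and the `verify` callable are hypothetical, and `verify` stands in for whatever wrapper the system uses around the Custom Search and Fact Check lookups.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

CONFIDENCE_THRESHOLD = 0.85  # the 85% threshold stated in the abstract


@dataclass
class Verdict:
    label: str         # "fake" or "real"
    confidence: float  # ensemble confidence in its own predicted label
    source: str        # "ensemble" or "external"


def ensemble_confidence(fake_probs: Sequence[float]) -> Tuple[str, float]:
    """Assumed soft vote: average each model's P(fake) and return
    the winning label together with the confidence in that label."""
    if not fake_probs:
        raise ValueError("need at least one model probability")
    p_fake = sum(fake_probs) / len(fake_probs)
    if p_fake >= 0.5:
        return "fake", p_fake
    return "real", 1.0 - p_fake


def classify(
    fake_probs: Sequence[float],
    text: str,
    verify: Callable[[str], Optional[str]],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Verdict:
    """Trust the ensemble when it is confident; otherwise consult `verify`.

    `verify` is a hypothetical callable that returns "fake", "real",
    or None when the external sources have nothing to say.
    """
    label, confidence = ensemble_confidence(fake_probs)
    if confidence >= threshold:
        return Verdict(label, confidence, "ensemble")

    external_label = verify(text)
    if external_label is None:
        # No external evidence: fall back to the low-confidence ensemble label.
        return Verdict(label, confidence, "ensemble")
    # Assumed policy: an external finding overrides the ensemble.
    return Verdict(external_label, confidence, "external")
```

Passing the verifier in as a callable keeps the network calls out of the decision logic, so the threshold behaviour can be exercised with a stub such as `lambda text: None` before any API keys are configured.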