A Comparative Analysis of Machine Learning Models for Diabetes Prediction and Early Diagnosis

Authors

  • Sohaib Latif, Department of Computer Science and Software Engineering, Grand Asian University, Sialkot, 51310, Pakistan. sohaib.latif@gaus.edu.pk
  • Daniyal Affandi, Department of Computer Science, The University of Chenab, Gujrat, 50700, Pakistan. affandidaniyal305@gmail.com
  • Sadia Karim, Department of Computer Science, The University of Chenab, Gujrat, 50700, Pakistan. sadiamobeen92@gmail.com

DOI:

https://doi.org/10.63163/jpehss.v3i1.154

Keywords:

Diabetes Prediction, Machine Learning, Healthcare AI, Data-Driven Healthcare, Early Diagnosis

Abstract

Diabetes is one of the most widespread and fastest-growing chronic diseases worldwide, affecting millions of people and posing serious health risks when it is not diagnosed early. Advances in machine learning (ML) make it possible to analyze patient data and estimate the likelihood of diabetes with high accuracy. This paper compares five machine learning models for building a diabetes prediction system: Random Forest, Logistic Regression, Decision Tree, Support Vector Classifier (SVC), and K-Nearest Neighbors (KNN). The dataset, sourced from Kaggle, contains 5,000 patient records with key health indicators such as glucose level, blood pressure, BMI, and age. The data were cleaned and analyzed, and each model was then trained and evaluated on accuracy, precision, recall, and F1-score. Among the models tested, the Random Forest classifier performed best, achieving an accuracy of 91.2%, ahead of SVC (88.7%) and Decision Tree (85.4%). These findings illustrate the growing role of machine learning in healthcare: predictive models can improve early diagnosis, enhance patient management, and support clinical decision-making. By adopting such ML-driven approaches, healthcare systems can move from traditional practices toward data-driven strategies that enable timely interventions and reduce the long-term complications associated with diabetes.
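
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function name `binary_classification_metrics` and the toy labels below are assumptions made for illustration; they are not taken from the paper, and the sketch does not reproduce the authors' pipeline or their reported results. It assumes binary labels in which 1 marks a diabetic patient and 0 a non-diabetic one.

```python
from typing import Dict, Sequence


def binary_classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption: label 1 is the positive (diabetic) class, 0 the negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels invented for illustration only (not data from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy alone can look strong when most records belong to one class, which is one reason to report precision and recall alongside it; in a screening setting, recall reflects how many true diabetic cases a model catches.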

Published

2025-03-02

How to Cite

Sohaib Latif, Daniyal Affandi, & Sadia Karim. (2025). A Comparative Analysis of Machine Learning Models for Diabetes Prediction and Early Diagnosis. Physical Education, Health and Social Sciences, 3(1), 10–22. https://doi.org/10.63163/jpehss.v3i1.154