Machine Learning-Based Diabetes Prediction Using Ensemble Methods

Authors

  • Dinesh Bajaj
  • Aditya Kumar

DOI:

https://doi.org/10.5281/ijurd.v1i2.64

Keywords:

Diabetes Prediction, Machine Learning, Ensemble Methods, Pima Dataset, Healthcare Analytics

Abstract

Diabetes mellitus is a chronic metabolic disorder in which early detection helps prevent severe complications and improves patient outcomes. This paper presents a machine learning-based diabetes prediction framework that uses ensemble methods to improve predictive performance and reliability. The proposed system draws on clinical datasets such as the Pima Indians Diabetes dataset and applies preprocessing steps, including data normalization, missing-value handling, and feature selection, to improve data quality. Several machine learning algorithms, including Decision Trees, Support Vector Machines, and Logistic Regression, are combined through the ensemble strategies of bagging, boosting, and stacking to improve accuracy. The framework leverages the strengths of the individual models while reducing variance and bias. Experimental results show that the ensemble-based approach outperforms the standalone models on accuracy, precision, and recall. Building on prior research in healthcare analytics and hybrid learning further improves model robustness and generalization. The study highlights the effectiveness of ensemble learning for building accurate and scalable diabetes prediction systems suitable for real-world healthcare applications.
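
The pipeline outlined in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses a synthetic table with the same shape as the Pima Indians Diabetes dataset (768 rows, 8 features) in place of real clinical data, and the imputation strategy, number of selected features, and hyperparameters are all placeholder choices. The helper name `with_preprocessing` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, StackingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a clinical table (assumption: same shape as the
# Pima Indians Diabetes dataset; the values are NOT real patient data).
X, y = make_classification(n_samples=768, n_features=8, n_informative=5, random_state=0)

# Blank out roughly 5% of the entries to imitate missing measurements.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan


def with_preprocessing(model):
    """Median imputation, standardization, univariate feature selection, then the model."""
    return make_pipeline(
        SimpleImputer(strategy="median"),  # missing-value handling
        StandardScaler(),                  # normalization
        SelectKBest(f_classif, k=6),       # feature selection (k is a placeholder)
        model,
    )


models = {
    # Standalone baseline.
    "tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    # Bagging: many trees fit on bootstrap resamples, predictions aggregated.
    "bagging": BaggingClassifier(
        DecisionTreeClassifier(random_state=0), n_estimators=50, random_state=0
    ),
    # Boosting: AdaBoost over its default shallow-tree weak learner.
    "boosting": AdaBoostClassifier(n_estimators=50, random_state=0),
    # Stacking: tree and SVM as base learners, logistic regression as meta-learner.
    "stacking": StackingClassifier(
        estimators=[
            ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
            ("svm", SVC(random_state=0)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    ),
}

# 5-fold cross-validated accuracy; preprocessing is refit inside every fold.
scores = {
    name: cross_val_score(with_preprocessing(model), X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name:>8}: {score:.3f}")
```

Wrapping the preprocessing and the classifier in one pipeline means the imputer, scaler, and feature selector are fit only on each training fold, which keeps information from the held-out fold out of the model. The scores printed here come from synthetic data and say nothing about the results reported in the paper; passing `scoring="precision"` or `scoring="recall"` to `cross_val_score` gives the other two metrics named in the abstract.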

Author Biographies

Dinesh Bajaj

Computer Applications, Galgotias University, Greater Noida

Aditya Kumar

Data Science, Ajay Kumar Garg Engineering College, Ghaziabad

Published

2025-10-27

How to Cite

Bajaj, D., & Kumar, A. (2025). Machine Learning-Based Diabetes Prediction Using Ensemble Methods. International Journal of Unified Research & Development (IJURD), 1(2). https://doi.org/10.5281/ijurd.v1i2.64
