Federated, Explainable and Imbalance-Aware Framework for Early Disease-Risk Prediction

Authors

  • Aayush Chaudhary, Galgotias College of Engineering, Greater Noida

DOI:

https://doi.org/10.5281/ijurd.v1i4.9

Keywords:

early disease prediction, explainable AI, federated learning, class imbalance, multi-modal fusion

Abstract

Early detection of chronic non-communicable diseases can reduce mortality and cost. We present an end-to-end pipeline that (1) fuses longitudinal electronic health records, laboratory results and behavioural variables via attention-based multi-modal learning, (2) trains in a federated, privacy-preserving manner with differential privacy guarantees, (3) handles severe class imbalance through cost-sensitive learning and Borderline-SMOTE, and (4) delivers clinician-oriented explanations using SHAP values. Experiments on two Indian hospital cohorts (29 733 patients) show absolute gains of 6.8 percentage points in AUROC and 7.4 percentage points in sensitivity over the best centralised baseline, while maintaining 90% specificity. The system generalises across institutions and is suitable for resource-constrained settings.
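
To make step (2) concrete, the following is a minimal sketch of one round of federated averaging in which each site's update is norm-clipped and Gaussian noise is added to the aggregate. It is an illustration under stated assumptions, not the paper's implementation: the function names (`clip_update`, `federated_round`), the parameters (`max_norm`, `noise_std`) and the site sizes in the usage lines are all hypothetical, and the sketch does not compute a formal privacy budget.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


def federated_round(global_weights, client_updates, client_sizes,
                    max_norm=1.0, noise_std=0.0, rng=None):
    """One round of size-weighted federated averaging with clipped, noised updates.

    client_updates: one weight-delta list per site (hypothetical input format).
    client_sizes:   number of local patients per site, used as averaging weights.
    noise_std:      std of the Gaussian noise added to the averaged update;
                    translating it into (epsilon, delta) is out of scope here.
    """
    rng = rng or random.Random(0)
    total = sum(client_sizes)
    dim = len(global_weights)
    averaged = [0.0] * dim
    for update, size in zip(client_updates, client_sizes):
        clipped = clip_update(update, max_norm)  # bound each site's influence
        for i in range(dim):
            averaged[i] += clipped[i] * size / total
    # Noise is added once, to the aggregate, before updating the global model.
    noised = [a + rng.gauss(0.0, noise_std) for a in averaged]
    return [w + d for w, d in zip(global_weights, noised)]


# Usage with two hypothetical sites; the sizes and updates are made up.
new_weights = federated_round(
    global_weights=[0.0, 0.0],
    client_updates=[[0.4, -0.2], [3.0, 4.0]],
    client_sizes=[100, 300],
    max_norm=1.0,
    noise_std=0.01,
)
```

Only clipped weight deltas leave each site in this sketch, so raw patient records stay local. A production system would additionally need secure aggregation and privacy accounting, neither of which is shown.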

Published

2025-12-31

How to Cite

Aayush Chaudhary. (2025). Federated, Explainable and Imbalance-Aware Framework for Early Disease-Risk Prediction. International Journal of Unified Research & Development (IJURD), 1(4). https://doi.org/10.5281/ijurd.v1i4.9