Federated, Explainable and Imbalance-Aware Framework for Early Disease-Risk Prediction
DOI:
https://doi.org/10.5281/ijurd.v1i4.9
Keywords:
early disease prediction, explainable AI, federated learning, class imbalance, multi-modal fusion
Abstract
Early detection of chronic non-communicable diseases can reduce mortality and cost. We present an end-to-end pipeline that (1) fuses longitudinal electronic health records, laboratory results and behavioural variables via attention-based multi-modal learning, (2) trains in a federated, privacy-preserving manner with differential privacy guarantees, (3) handles severe class imbalance through cost-sensitive learning and Borderline-SMOTE, and (4) delivers clinician-oriented explanations using SHAP values. Experiments on two Indian hospital cohorts (29 733 patients) show absolute gains of 6.8% in AUROC and 7.4% in sensitivity over the best centralised baseline while maintaining 90% specificity. The system generalises across institutions and is suitable for resource-constrained settings.
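To make steps (2) and (3) of the abstract concrete, the following is a minimal sketch of one way federated averaging can be combined with cost-sensitive learning. It is not the authors' implementation. It assumes a logistic-regression model, synthetic two-site data, and hypothetical helper names (`local_update`, `clip`, `federated_round`, `make_client`). The Gaussian noise added to the averaged update only illustrates the clip-and-noise pattern used in differentially private training; the noise scale is not calibrated to any privacy budget.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def local_update(global_w, data, lr=0.1, epochs=5):
    """Train on one site's data; return the weight delta (local - global).

    Cost-sensitive step: positive (minority) samples are weighted by the
    site's own negative/positive ratio, so no raw data leaves the site.
    """
    n_pos = sum(y for _, y in data)
    pos_weight = (len(data) - n_pos) / max(n_pos, 1)
    w = list(global_w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            cost = pos_weight if y == 1 else 1.0
            for j, xj in enumerate(x):
                grad[j] += cost * (p - y) * xj
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return [wl - wg for wl, wg in zip(w, global_w)]


def clip(delta, max_norm):
    """Scale an update so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(d * d for d in delta))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [d * scale for d in delta]


def federated_round(global_w, clients, rng, max_norm=1.0, noise_std=0.01):
    """One round: local training, clipping, averaging, illustrative noise."""
    deltas = [clip(local_update(global_w, data), max_norm) for data in clients]
    avg = [sum(d[j] for d in deltas) / len(deltas) for j in range(len(global_w))]
    # Assumption: noise_std is illustrative, not derived from a privacy budget.
    noisy = [a + rng.gauss(0.0, noise_std) for a in avg]
    return [w + d for w, d in zip(global_w, noisy)]


def make_client(rng, n, pos_rate):
    """Synthetic site: features are [bias, x]; positives have a higher mean x."""
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < pos_rate else 0
        rows.append(([1.0, rng.gauss(2.0 if y else 0.0, 1.0)], y))
    return rows


rng = random.Random(0)
clients = [make_client(rng, 200, 0.1) for _ in range(2)]  # two imbalanced sites
weights = [0.0, 0.0]
for _ in range(20):
    weights = federated_round(weights, clients, rng)
print("global weights after 20 rounds:", weights)
```

Only weight deltas cross the site boundary in this sketch; each site computes its class weight locally. Oversampling with Borderline-SMOTE, the attention-based fusion model and SHAP explanations are outside its scope.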
License
Copyright (c) 2025 Aayush Chaudhary

This work is licensed under a Creative Commons Attribution 4.0 International License.