Self-Supervised Learning Framework for Medical Data Representation in Low-Resource Healthcare Environments

Anita Kaur; Ajay Arora; Priya Verma; Anita Brar

doi:10.5281/ijurd.v2i1.34

Keywords:

Diabetes Prediction, Machine Learning, Ensemble Methods, Pima Dataset, Healthcare Analytics

Abstract

Scarce labeled medical data is a major obstacle to building robust predictive models for healthcare, particularly in low-resource environments. This paper presents a Self-Supervised Learning Framework for Medical Data Representation that uses large volumes of unlabeled healthcare data to improve model performance. The framework learns representations from medical images, clinical text, and physiological signals without requiring extensive manual annotation. Pretext tasks such as contrastive learning and masked data prediction capture the underlying structure of the data. The learned representations are then fine-tuned for downstream tasks such as disease classification and risk prediction. The framework is integrated with deep learning architectures to improve feature extraction and generalization. Experimental results demonstrate that the proposed approach significantly improves performance when labeled data are limited. Building on prior research in healthcare analytics further strengthens the system's robustness and adaptability. The study highlights the potential of self-supervised learning to enable scalable and efficient healthcare solutions, especially in resource-constrained settings.
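The abstract names two pretext tasks, contrastive learning and masked data prediction, without giving their form. The following is a minimal illustrative sketch of what such objectives commonly look like, not the authors' implementation. It assumes an NT-Xent style contrastive loss (as popularized by SimCLR) over paired embeddings and a simple feature-masking step for tabular or signal records; the function names `nt_xent_loss` and `mask_features`, the temperature of 0.5, and the mask fraction are all hypothetical choices made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Assumed contrastive objective: NT-Xent over a batch of paired embeddings.

    view_a[i] and view_b[i] are embeddings of two augmented views of the
    same unlabeled sample; every other embedding in the batch is a negative.
    """
    n = len(view_a)
    z = list(view_a) + list(view_b)  # indices i and i + n form a positive pair
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        # Numerically stable log-sum-exp over all candidates except i itself.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - pos_logit
    return total / (2 * n)


def mask_features(record, mask_fraction=0.25, mask_value=0.0, rng=None):
    """Assumed masked-prediction setup: hide some features, keep them as targets.

    Returns the corrupted record and a dict mapping each masked index to the
    original value a model would be trained to reconstruct.
    """
    rng = rng or random.Random(0)
    n_mask = max(1, round(len(record) * mask_fraction))
    masked_idx = sorted(rng.sample(range(len(record)), n_mask))
    corrupted = list(record)
    targets = {}
    for i in masked_idx:
        targets[i] = record[i]
        corrupted[i] = mask_value
    return corrupted, targets


if __name__ == "__main__":
    # Toy embeddings: view_b is a slightly perturbed copy of view_a.
    view_a = [[1.0, 0.0], [0.0, 1.0]]
    view_b = [[0.9, 0.1], [0.1, 0.9]]
    print("contrastive loss:", nt_xent_loss(view_a, view_b))

    # Toy physiological record with four features.
    corrupted, targets = mask_features([72.0, 118.0, 36.6, 98.0])
    print("corrupted:", corrupted, "targets:", targets)
```

In this sketch the contrastive loss falls when the two views of the same sample are embedded close together and rises when they are confused with other samples, which is the property a pretext task of this kind relies on. In practice an encoder network would produce the embeddings and reconstruct the masked values; both are omitted here.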

Author Biographies

Anita Kaur

Artificial Intelligence and Machine Learning, Sharda University, Greater Noida

Ajay Arora

Computer Applications, Maharaja Agrasen Institute of Technology, Delhi

Priya Verma

Information Science, Netaji Subhas University of Technology, Delhi

Anita Brar

Information Technology, Gautam Buddha University, Greater Noida
