NLP-Based Clinical Text Summarization Using Transformers

Authors

  • Preeti Mishra
  • Rohit Mann
  • Yamini Chatterjee
  • Tanvi Rao

DOI:

https://doi.org/10.5281/ijurd.v1i2.68

Keywords:

Text Summarization, Transformer Models, Clinical NLP, Healthcare Documentation

Abstract

The increasing volume of clinical text data, including electronic health records, discharge summaries, and medical reports, has created a need for efficient summarization techniques to support clinical decision-making. This paper presents an NLP-based clinical text summarization framework that uses Transformer architectures to generate concise, meaningful summaries from unstructured medical text. The proposed system leverages models such as Bidirectional Encoder Representations from Transformers (BERT) to capture contextual and semantic relationships within clinical narratives. Both extractive and abstractive summarization approaches are explored to balance information retention and readability. The framework incorporates domain-specific preprocessing and fine-tuning to improve performance on medical datasets. Experimental results demonstrate that the proposed approach produces summaries with improved coherence and relevance compared to traditional methods. In addition, building on prior research in machine learning and healthcare analytics improves the system's robustness and adaptability. The study highlights the potential of transformer-based NLP models to reduce clinician workload and improve the efficiency of healthcare information management.
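To make the extractive side of the pipeline concrete, the sketch below scores each sentence of a clinical note by the frequency of its content words and keeps the top-ranked sentences in document order. This is a minimal frequency-based illustration of extractive summarization, not the paper's actual model: the authors fine-tune transformer encoders such as BERT, and the function name, stop-word list, and scoring rule here are illustrative assumptions only. The abstractive branch would instead use a sequence-to-sequence transformer, which is omitted because it requires pretrained model weights.

```python
import re
from collections import Counter

def extractive_summary(text: str, k: int = 2) -> str:
    """Toy extractive summarizer: rank sentences by average
    content-word frequency, return the top-k in original order.
    (Illustrative sketch only; the paper uses transformer encoders.)"""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # A tiny illustrative stop-word list (an assumption, not a clinical lexicon).
    stopwords = {"the", "a", "an", "of", "and", "to", "was", "is", "in", "with", "for", "on"}
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stopwords]
    freq = Counter(words)

    def score(sentence: str) -> float:
        tokens = [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in stopwords]
        # Average frequency of the sentence's content words.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    # Select the k highest-scoring sentences, then restore document order.
    top = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)[:k]
    return " ".join(sentences[i] for i in sorted(top))
```

In a real clinical setting the frequency scores would be replaced by sentence representations from a fine-tuned biomedical encoder (e.g. BioBERT, cited below), but the select-and-reorder structure of extractive summarization stays the same.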

Author Biographies

Preeti Mishra

Artificial Intelligence and Machine Learning, Gautam Buddha University, Greater Noida

Rohit Mann

Data Science, Gautam Buddha University, Greater Noida

Yamini Chatterjee

Artificial Intelligence and Machine Learning, Chitkara University, Baddi

Tanvi Rao

Computer Applications, Deenbandhu Chhotu Ram University of Science and Technology, Murthal

References

Aman, & Chhillar, R. S. (2021). Analyzing predictive algorithms in data mining for cardiovascular disease using WEKA tool. International Journal of Advanced Computer Science and Applications, 12(8), 144–150.

Aman, & Chhillar, R. S. (2022). Analyzing three predictive algorithms for diabetes mellitus against the Pima Indians dataset. ECS Transactions, 107(1), 2697.

Aman, & Chhillar, R. S. (2023). Optimized stacking ensemble for early-stage diabetes mellitus prediction. International Journal of Electrical and Computer Engineering, 13(6).

Aman, & Chhillar, R. S. (2024). A stacking-based hybrid model with random forest as meta-learner for diabetes mellitus prediction. International Journal of Machine Learning, 14(2), 54–58.

Aman, Chhillar, R. S., & Chhillar, U. (2023). Disease prediction in healthcare: An ensemble learning perspective.

Aman, Chhillar, R. S., & Chhillar, U. (2024). Machine learning in the battle against COVID-19: Predictive models and future directions. Future Computing Technologies for Sustainable Development (NCFCTSD-24).

Aman, Chhillar, R. S., & Chhillar, U. (2025). Machine learning and chronic kidney disease: Towards early prediction and diagnosis. Emerging Trends in Engineering, Commerce, Management and Hospitality Management in the Digital Age for a Sustainable Future.

Darolia, A., Chhillar, R. S., Alhussein, M., Dalal, S., Aurangzeb, K., & Lilhore, U. K. (2024). Enhanced cardiovascular disease prediction through self-improved Aquila optimized feature selection in quantum neural network and LSTM model. Frontiers in Medicine, 11, 1414637.

Aman, & Chhillar, R. S. (2020). Disease predictive models for healthcare by using data mining techniques: State of the art. SSRG International Journal of Engineering Trends and Technology, 68(10). Available: https://www.researchgate.net/profile/Aman-Darolia/publication/345397957_Disease_Predictive_Models_for_Healthcare_by_using_Data_Mining_Techniques_State_of_the_Art/links/63b599fa03aad5368e64aa42/Disease-Predictive-Models-for-Healthcare-by-using-Data-Mining-Techniques-State-of-the-Art.pdf

Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of NAACL-HLT.

Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). Attention is all you need. Advances in Neural Information Processing Systems.

Lee, J., Yoon, W., Kim, S., et al. (2020). BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4), 1234–1240.

Zhang, Y., Chen, Q., Yang, Z., et al. (2019). BioWordVec: Improving biomedical word embeddings with subword information and MeSH. Scientific Data, 6, 52.

Published

2025-10-27

How to Cite

Mishra, P., Mann, R., Chatterjee, Y., & Rao, T. (2025). NLP-Based Clinical Text Summarization Using Transformers. International Journal of Unified Research & Development (IJURD), 1(2). https://doi.org/10.5281/ijurd.v1i2.68
