Speech-Based Mental Health Detection Using Deep Learning
DOI: https://doi.org/10.5281/ijurd.v1i1.78
Keywords: Mental Health, Speech Analysis, Deep Learning, Depression Detection, Audio Processing
Abstract
Mental health disorders are increasingly prevalent and often remain undiagnosed due to social stigma and lack of accessible diagnostic tools. This paper presents a Speech-Based Mental Health Detection system using deep learning techniques to identify psychological conditions such as depression, anxiety, and stress from voice signals. The proposed framework utilizes acoustic and prosodic features, including pitch, tone, energy, and speech patterns, to capture emotional and cognitive states. Deep learning models such as Convolutional Neural Networks and Recurrent Neural Networks are employed to learn complex temporal and spectral patterns in speech data. The system is further enhanced by integrating hybrid learning strategies to improve classification accuracy and robustness. Experimental results demonstrate that the proposed approach achieves reliable performance in detecting mental health conditions and offers a non-invasive, cost-effective solution for early diagnosis. Additionally, the integration of prior research in disease prediction and ensemble learning contributes to improved model generalization. The study highlights the potential of speech-based analysis as an effective tool for continuous mental health monitoring, particularly in remote and resource-limited settings.
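As an illustration of the acoustic and prosodic features named in the abstract, the following minimal sketch computes two of them, frame-level energy and pitch, from a raw waveform. It is not the authors' pipeline: the function names, the frame length and hop size, the 75-400 Hz pitch search range, and the autocorrelation pitch estimator are all assumptions chosen for the example, and a synthetic 200 Hz tone stands in for recorded speech.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sequence of samples into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame (a simple loudness proxy)."""
    return sum(x * x for x in frame) / len(frame)

def estimate_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate pitch in Hz by picking the autocorrelation peak.

    Only lags corresponding to frequencies in [fmin, fmax] are searched
    (an assumed range for speech). Returns 0.0 if no positive peak is found.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

# Synthetic stand-in for a voiced speech segment: a 200 Hz sine at 8 kHz.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1600)]

# 50 ms frames with a 25 ms hop (assumed values).
frames = frame_signal(tone, frame_len=400, hop=200)

# One (energy, pitch) pair per frame; a sequence like this is the kind of
# input a CNN or RNN classifier would consume.
features = [(short_time_energy(f), estimate_pitch(f, sample_rate)) for f in frames]
```

A practical system would typically add spectral features (for example, mel-spectrograms) and use a more robust pitch tracker, but the framing step shown here is common to most such front ends.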
License
Copyright (c) 2025 Ritu Nair, Dinesh Dhillon, Seema Bhatia, Pooja Ghosh

This work is licensed under a Creative Commons Attribution 4.0 International License.