Speech-Based Mental Health Detection Using Deep Learning
DOI:
https://doi.org/10.5281/ijurd.v1i4.48
Keywords:
Mental Health, Speech Analysis, Deep Learning, Depression Detection, Audio Processing
Abstract
Mental health disorders are increasingly prevalent and often remain undiagnosed due to social stigma and limited access to early diagnostic tools. This paper presents a Speech-Based Mental Health Detection system using deep learning techniques to identify psychological conditions such as depression, anxiety, and stress from voice signals. The proposed framework extracts acoustic and prosodic features, including pitch, tone, intensity, and speech rhythm, to capture emotional and cognitive patterns. Deep learning models such as Convolutional Neural Networks and Recurrent Neural Networks are employed to learn complex temporal and spectral representations from speech data. The system is further enhanced through hybrid learning strategies to improve classification accuracy and robustness. Experimental results indicate that the proposed approach achieves reliable performance in detecting mental health conditions and offers a non-invasive and cost-effective solution for early screening. Additionally, integration with prior research in disease prediction and ensemble learning improves generalization capability. The study demonstrates that speech-based analysis can serve as an effective tool for continuous mental health monitoring, particularly in remote and resource-constrained environments.
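The abstract names pitch and intensity among the acoustic features but does not say how they are computed. As an illustration only, the following Python sketch shows one common way such frame-level features might be derived: root-mean-square energy as a proxy for intensity and an autocorrelation peak as a pitch estimate. The function names, the frame and hop sizes, the 75–400 Hz search range, and the synthetic test tone are all assumptions made for this example and are not taken from the paper.

```python
import math

def synth_tone(freq_hz, amplitude, sample_rate, seconds):
    """Generate a sine tone; a stand-in for real speech in this sketch."""
    n = int(sample_rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

def frame_signal(samples, frame_len, hop):
    """Split the signal into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def rms_intensity(frame):
    """Root-mean-square energy of a frame (simple intensity proxy)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def autocorr_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate pitch in Hz from the lag with the largest raw autocorrelation.

    The search range [fmin, fmax] is an assumed range for adult speech.
    Returns 0.0 when no lag has a positive score (treated as unvoiced).
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag]
                    for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

def extract_features(samples, sample_rate, frame_len=400, hop=200):
    """Return a list of (pitch_hz, rms_intensity) pairs, one per frame."""
    return [(autocorr_pitch(f, sample_rate), rms_intensity(f))
            for f in frame_signal(samples, frame_len, hop)]

if __name__ == "__main__":
    # Hypothetical input: a 200 Hz tone sampled at 8 kHz for 0.25 s.
    tone = synth_tone(200.0, 0.5, 8000, 0.25)
    for pitch, intensity in extract_features(tone, 8000):
        print(f"pitch={pitch:.1f} Hz  intensity={intensity:.3f}")
```

In a pipeline like the one the abstract describes, per-frame sequences of this kind (or richer spectral representations) would be the input to the convolutional and recurrent models. Real speech would also need voicing detection and more robust pitch tracking than this raw autocorrelation peak.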
References
Aman, & Chhillar, R. S. (2021). Analyzing predictive algorithms in data mining for cardiovascular disease using WEKA tool. International Journal of Advanced Computer Science and Applications, 12(8), 144–150.
Aman, & Chhillar, R. S. (2022). Analyzing three predictive algorithms for diabetes mellitus against the Pima Indians dataset. ECS Transactions, 107(1), 2697.
Aman, & Chhillar, R. S. (2023). Optimized stacking ensemble for early-stage diabetes mellitus prediction. International Journal of Electrical and Computer Engineering, 13(6).
Aman, & Chhillar, R. S. (2024). A stacking-based hybrid model with random forest as meta-learner for diabetes mellitus prediction. International Journal of Machine Learning, 14(2), 54–58.
Aman, Chhillar, R. S., & Chhillar, U. (2023). Disease prediction in healthcare: An ensemble learning perspective.
Aman, Chhillar, R. S., & Chhillar, U. (2024). Machine learning in the battle against COVID-19: Predictive models and future directions. Future Computing Technologies for Sustainable Development (NCFCTSD-24).
Aman, Chhillar, R. S., & Chhillar, U. (2025). Machine learning and chronic kidney disease: Towards early prediction and diagnosis. Emerging Trends in Engineering, Commerce, Management and Hospitality Management in the Digital Age for a Sustainable Future.
Darolia, A., Chhillar, R. S., Alhussein, M., Dalal, S., Aurangzeb, K., & Lilhore, U. K. (2024). Enhanced cardiovascular disease prediction through self-improved Aquila optimized feature selection in quantum neural network and LSTM model. Frontiers in Medicine, 11, 1414637.
Aman, C. R. (2020). Disease predictive models for healthcare by using data mining techniques: State of the art. SSRG International Journal of Engineering Trends and Technology, 68(10). Available: https://www.researchgate.net/profile/Aman-Darolia/publication/345397957_Disease_Predictive_Models_for_Healthcare_by_using_Data_Mining_Techniques_State_of_the_Art/links/63b599fa03aad5368e64aa42/Disease-Predictive-Models-for-Healthcare-by-using-Data-Mining-Techniques-State-of-the-Art.pdf
Cummins, N., Scherer, S., Krajewski, J., et al. (2015). A review of depression and suicide risk assessment using speech analysis. Speech Communication, 71, 10–49.
Alhanai, T., Ghassemi, M., & Glass, J. (2018). Detecting depression with audio/text sequence modeling of interviews. Proceedings of Interspeech.
Low, L. A., Maddage, N. C., Lech, M., et al. (2010). Influence of acoustic low-level descriptors in the detection of clinical depression in adolescents. Proceedings of ICASSP.
Tzirakis, P., Trigeorgis, G., Nicolaou, M. A., et al. (2017). End-to-end multimodal emotion recognition using deep neural networks. IEEE Journal of Selected Topics in Signal Processing.
License
Copyright (c) 2025 Karthik Yadav, Vijay Sidhu

This work is licensed under a Creative Commons Attribution 4.0 International License.