AI Voice Detection and Decipher of Depression from Samples of Audio Data

Authors

  • Pongpak Manoret, Triam Udom Suksa School, Thailand
  • Punnatorn Chotipurk, Triam Udom Suksa School, Thailand
  • Sompoom Sunpaweravong, Triam Udom Suksa School, Thailand
  • Chanati Jantrachotechatchawan, Research Division, Faculty of Medicine Siriraj Hospital, Mahidol University, Thailand
  • Dr. Wong Jest Phia, Westwood Clinic – Re-Mind App, Kuala Lumpur, Malaysia

Keywords:

depression, automatic detection, deep learning, speech processing

Abstract

Depression is a common mental disorder that affects millions of people around the world and has become more severe with the arrival of COVID-19. Nevertheless, proper diagnosis is not accessible in many regions because of a severe shortage of psychiatrists. This scarcity is worse in low-income countries, which have a psychiatrist-to-population ratio 210 times lower than that of countries with better economies. This study aimed to explore applications of deep learning in diagnosing depression from voice samples. We used data from the DAIC-WOZ database, which contained 189 vocal recordings from 154 individuals. Voice samples from participants with a PHQ-8 score of 10 or higher were labeled as depressed, and those with a PHQ-8 score lower than 10 were considered healthy. We used mel-spectrograms to extract relevant features from the audio. Three types of encoders were tested: 1D CNN, 1D CNN-LSTM, and 1D CNN-GRU. After systematic hyperparameter tuning, we found that the 1D CNN-GRU encoder with a kernel size of 5 and 15 seconds of recording data appeared to perform best, with an F1 score of 0.75, precision of 0.64, and recall of 0.92.
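
The labeling rule and the reported metrics can be checked with a few lines of Python. This is only a minimal sketch, not the authors' code: the function names `label_from_phq8` and `f1_from_precision_recall` are hypothetical, and the 0 to 24 score range is the standard PHQ-8 range rather than something stated in the abstract. It applies the cutoff of 10 and confirms that the reported precision and recall are consistent with the reported F1 score.

```python
# Illustrative sketch only; names are hypothetical, not from the paper.

PHQ8_CUTOFF = 10  # threshold stated in the abstract


def label_from_phq8(score: int) -> str:
    """Map a PHQ-8 total score to the binary label used in the study."""
    # PHQ-8 has 8 items scored 0-3, so valid totals lie in 0..24.
    if not 0 <= score <= 24:
        raise ValueError(f"PHQ-8 score out of range: {score}")
    return "depressed" if score >= PHQ8_CUTOFF else "healthy"


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported values for the best 1D CNN-GRU configuration.
reported_f1 = f1_from_precision_recall(0.64, 0.92)
print(f"F1 = {reported_f1:.2f}")
```

The high recall paired with lower precision means the model misses few depressed speakers but also flags a number of healthy ones.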

Published

15-03-2024

How to Cite

Manoret, P., P. Chotipurk, S. Sunpaweravong, C. Jantrachotechatchawan, and D. W. Jest Phia. “AI Voice Detection and Decipher of Depression from Samples of Audio Data”. Journal of Science & Technology, vol. 5, no. 2, Mar. 2024, pp. 1-33, https://thesciencebrigade.com/jst/article/view/141.

License Terms

Ownership and Licensing:

Authors of this research paper submitted to the Journal of Science & Technology retain the copyright of their work while granting the journal certain rights. Authors maintain ownership of the copyright and have granted the journal a right of first publication. Simultaneously, authors agreed to license their research papers under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.

License Permissions:

Under the CC BY-NC-SA 4.0 License, others are permitted to share and adapt the work, as long as proper attribution is given to the authors and acknowledgement is made of the initial publication in the Journal of Science & Technology. This license allows for the broad dissemination and utilization of research papers.

Additional Distribution Arrangements:

Authors are free to enter into separate contractual arrangements for the non-exclusive distribution of the journal's published version of the work. This may include posting the work to institutional repositories, publishing it in journals or books, or other forms of dissemination. In such cases, authors are requested to acknowledge the initial publication of the work in the Journal of Science & Technology.

Online Posting:

Authors are encouraged to share their work online, including in institutional repositories, disciplinary repositories, or on their personal websites. This permission applies both prior to and during the submission process to the Journal of Science & Technology. Online sharing enhances the visibility and accessibility of the research papers.

Responsibility and Liability:

Authors are responsible for ensuring that their research papers do not infringe upon the copyright, privacy, or other rights of any third party. The Journal of Science & Technology and The Science Brigade Publishers disclaim any liability or responsibility for any copyright infringement or violation of third-party rights in the research papers.
