Can AI recognize the signs of depression in people’s voices?

A machine learning tool identified vocal markers of depression in over 70% of cases from as little as 25 seconds of speech, highlighting its potential for improving mental health screening in primary care and virtual healthcare settings.

Study: Evaluation of an AI-Based Voice Biomarker Tool to Detect Signals Consistent With Moderate to Severe Depression. Image Credit: PeopleImages.com - Yuri A/Shutterstock.com

In a recent article in The Annals of Family Medicine, researchers evaluated the effectiveness of a machine learning (ML) tool for detecting vocal signs linked to severe or moderate depression.

Working from speech samples as short as 25 seconds, the tool correctly identified more than 70% of depression cases, highlighting its potential utility for mental health screening.

Background

Depression is a major health issue, affecting about 18 million Americans annually, with nearly 30% experiencing it at some point in their lives.

Despite guidelines recommending universal screening, depression screening rates in primary care remain very low (<4%), and even when screening is recommended, fewer than 50% of eligible patients are tested.

ML has the potential to improve screening rates without adding extra administrative work. People experiencing depression often have distinct speech patterns, including stuttering, hesitations, longer pauses, and slower speech. ML can analyze these vocal traits, known as voice biomarkers, to detect signs of depression.

Using ML for voice-based depression screening offers a noninvasive, objective, and automated way to identify at-risk individuals, particularly in virtual healthcare settings.

This approach could make screening more accessible and efficient, ultimately helping clinicians detect depression earlier and improve patient care.

About the study

Researchers explored whether ML could detect signs of depression by analyzing speech patterns. They studied 14,898 adults recruited through social media from the U.S. and Canada. To ensure a diverse group, they specifically targeted men and older adults in their outreach.

Participants completed a standard depression questionnaire and recorded at least 25 seconds of speech using their phones or computers. Researchers processed the recordings to ensure clear and consistent audio quality.

The ML model analyzed the voice recordings to determine if someone might have moderate to severe depression.

It sorted participants into three categories: likely to have depression, if their voice patterns strongly suggested it; no signs of depression, if no clear vocal markers were found; and further evaluation recommended, if the results were unclear.
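The three-way sorting described above can be sketched in a few lines of Python. This is only an illustration, assuming the model produces a single score between 0 and 1 and that two cutoffs separate the categories; the article does not describe the tool's internals, and the names `Triage`, `triage`, `lower`, and `upper`, as well as the cutoff values, are hypothetical.

```python
from enum import Enum


class Triage(Enum):
    """The three screening outcomes described in the article."""
    LIKELY_DEPRESSION = "signs of depression detected"
    NO_SIGNS = "no signs of depression detected"
    FURTHER_EVALUATION = "further evaluation recommended"


def triage(score: float, lower: float = 0.4, upper: float = 0.6) -> Triage:
    """Map a model score in [0, 1] to one of three outcomes.

    The cutoffs `lower` and `upper` are hypothetical placeholders,
    not values taken from the study.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if lower > upper:
        raise ValueError("lower cutoff must not exceed upper cutoff")
    if score >= upper:
        return Triage.LIKELY_DEPRESSION
    if score <= lower:
        return Triage.NO_SIGNS
    # Scores between the two cutoffs are treated as unclear.
    return Triage.FURTHER_EVALUATION


if __name__ == "__main__":
    for s in (0.15, 0.50, 0.85):
        print(s, "->", triage(s).value)
```

Widening the gap between the two cutoffs would send more people to further evaluation; narrowing it would force more definite answers at the cost of more errors.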

To check accuracy, researchers compared the ML model’s predictions with participants’ actual questionnaire results. They also fine-tuned the system to reduce errors.

Findings

The study analyzed voice recordings from 14,898 participants, splitting them into two groups: 10,442 for training and the remaining 4,456 for validation. Participants' speech samples ranged from 25 to slightly under 75 seconds, with an average of about 58 seconds. Their self-reported depression scores ranged from 0 to 27, with a median of 9.

Of the 4,456 validation samples, the ML model gave a definite result for 3,536, classifying each as showing or not showing markers of depression.

It achieved a sensitivity (ability to detect depression) of 71.3% and a specificity (ability to rule out depression) of 73.5%. About 20% of validation samples (920) were classified as uncertain, requiring further evaluation.
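For readers who want the definitions behind these figures, the following Python sketch computes sensitivity and specificity from confusion-matrix counts and checks the sample arithmetic reported in the article. The sample totals (4,456, 3,536, and 920) come from the article; the four confusion-matrix counts are hypothetical numbers chosen only to reproduce the reported percentages, since the article does not give the study's actual counts.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people with depression that the tool flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people without depression that the tool clears."""
    return true_neg / (true_neg + false_pos)


# Sample counts reported in the article.
validation_total = 4456
definite_results = 3536
uncertain_results = 920

# The definite and uncertain results together make up the validation set.
counts_add_up = definite_results + uncertain_results == validation_total
uncertain_share = uncertain_results / validation_total

# Hypothetical counts per 1,000 people in each group, for illustration only.
example_sensitivity = sensitivity(true_pos=713, false_neg=287)
example_specificity = specificity(true_neg=735, false_pos=265)

if __name__ == "__main__":
    print("counts add up:", counts_add_up)
    print(f"uncertain share: {uncertain_share:.1%}")
    print(f"sensitivity: {example_sensitivity:.1%}")
    print(f"specificity: {example_specificity:.1%}")
```

The uncertain share works out to roughly 20.6%, which matches the "about 20%" quoted in the article.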

The model performed differently across demographic groups. Sensitivity was highest among Hispanic/Latine (80.3%) and Black/African American (72.4%) participants. Specificity was highest for Asian/Pacific Islander (77.5%) and Black/African American (75.9%) groups.

Women had higher sensitivity (74%) but lower specificity (68.9%), while men had lower sensitivity (59.3%) but higher specificity (83.9%). Younger participants (under 60) had more consistent results than older participants (60 and above), whose sensitivity was 63.4% but specificity was 86.8%.

Overall, the ML model showed promise for depression screening, though accuracy varied by age, gender, and ethnicity.

Conclusions

This study explored the potential of ML for detecting vocal patterns associated with moderate to severe depression. The ML model analyzed short speech samples and performed similarly to established screening tools, with a sensitivity of 71.3% and specificity of 73.5%.

While not a replacement for clinical diagnosis, this technology could help primary care doctors screen more patients efficiently. Similar ML tools have been applied to detect neurological conditions, highlighting their potential in healthcare.

One challenge is balancing false negatives against false positives; this balance can be adjusted depending on clinical needs. The model was less sensitive for men, possibly due to their lower representation in the training data and differences in depression symptoms.
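The trade-off between false negatives and false positives can be illustrated with a minimal Python sketch. It assumes a single decision cutoff applied to model scores, which is a simplification and not necessarily how the study's tool works; the scores and labels are made up for illustration, and the function name `rates_at` is hypothetical.

```python
from typing import Sequence, Tuple


def rates_at(threshold: float,
             scores: Sequence[float],
             labels: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are flagged.

    labels: 1 = depression per questionnaire, 0 = no depression.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up scores and labels, for illustration only.
scores = [0.10, 0.25, 0.35, 0.45, 0.55, 0.65, 0.80, 0.90]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

if __name__ == "__main__":
    for t in (0.3, 0.5, 0.7):
        sens, spec = rates_at(t, scores, labels)
        print(f"threshold {t}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

On this toy data, lowering the cutoff catches every case but clears only half of the people without depression, while raising it does the opposite. A clinic that prioritizes not missing cases would lean toward the lower cutoff.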

Older adults also had lower sensitivity but higher specificity, suggesting that age-related voice changes might influence results.

The study had diverse participants across the U.S. and Canada, but more research is needed to understand how comorbid conditions impact voice biomarkers. Future studies should also refine the model for better accuracy across different populations.

While still in development, ML-based voice analysis could support universal depression screening, helping clinicians detect depression earlier and reduce diagnostic bias.

Overall, the study suggests that ML-based voice analysis could be a useful tool for depression screening, making it easier for doctors to identify those in need. However, more research is necessary before it can be widely used.

Written by

Dr. Chinta Sidharthan

Chinta Sidharthan is a writer based in Bangalore, India. Her academic background is in evolutionary biology and genetics, and she has extensive experience in scientific research, teaching, science writing, and herpetology. Chinta holds a Ph.D. in evolutionary biology from the Indian Institute of Science and is passionate about science education, writing, animals, wildlife, and conservation. For her doctoral research, she explored the origins and diversification of blindsnakes in India, as a part of which she did extensive fieldwork in the jungles of southern India. She has received the Canadian Governor General’s bronze medal and Bangalore University gold medal for academic excellence and published her research in high-impact journals.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Sidharthan, Chinta. (2025, January 30). Can AI recognize the signs of depression in people’s voices?. News-Medical. Retrieved on January 30, 2025 from https://www.news-medical.net/news/20250130/Can-AI-recognize-the-signs-of-depression-in-peoplee28099s-voices.aspx.

  • MLA

    Sidharthan, Chinta. "Can AI recognize the signs of depression in people’s voices?". News-Medical. 30 January 2025. <https://www.news-medical.net/news/20250130/Can-AI-recognize-the-signs-of-depression-in-peoplee28099s-voices.aspx>.

  • Chicago

    Sidharthan, Chinta. "Can AI recognize the signs of depression in people’s voices?". News-Medical. https://www.news-medical.net/news/20250130/Can-AI-recognize-the-signs-of-depression-in-peoplee28099s-voices.aspx. (accessed January 30, 2025).

  • Harvard

    Sidharthan, Chinta. 2025. Can AI recognize the signs of depression in people’s voices?. News-Medical, viewed 30 January 2025, https://www.news-medical.net/news/20250130/Can-AI-recognize-the-signs-of-depression-in-peoplee28099s-voices.aspx.
