Background and goal: Depression affects an estimated 18 million Americans each year, yet screening for it rarely occurs in outpatient settings. This study evaluated an AI-based voice biomarker tool that uses machine learning to detect signs of moderate to severe depression from speech patterns, with the aim of improving access to screening in primary care.
Study approach: The study analyzed over 14,000 voice samples from U.S. and Canadian adults. Participants answered the question, "How was your day?" with at least 25 seconds of free-form speech. The tool analyzed vocal biomarkers associated with depression, including speech cadence, hesitations, pauses, and other acoustic features. These were compared to results from the Patient Health Questionnaire-9 (PHQ-9), a standard depression screening tool. A PHQ-9 score of 10 or higher indicated moderate to severe depression. The AI tool provided three outputs: Signs of Depression Detected, Signs of Depression Not Detected, and Further Evaluation Recommended (for uncertain cases).
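The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the study's implementation: the PHQ-9 cutoff of 10 comes from the text, but the function names, the idea of a single model score between 0 and 1, and the `low` and `high` thresholds are all assumptions made here for the example.

```python
# Reference label: PHQ-9 total of 10 or higher (cutoff taken from the text).
PHQ9_CUTOFF = 10


def phq9_positive(item_scores):
    """Return True if the PHQ-9 total meets the moderate-to-severe cutoff.

    The PHQ-9 has nine items, each scored 0 to 3, so totals range from 0 to 27.
    """
    if len(item_scores) != 9 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("expected nine item scores, each between 0 and 3")
    return sum(item_scores) >= PHQ9_CUTOFF


def tool_output(score, low=0.4, high=0.6):
    """Map a hypothetical model score in [0, 1] to one of the three outputs.

    The thresholds are illustrative assumptions; the source does not say how
    the tool decides which cases are uncertain.
    """
    if score >= high:
        return "Signs of Depression Detected"
    if score < low:
        return "Signs of Depression Not Detected"
    return "Further Evaluation Recommended"


if __name__ == "__main__":
    print(phq9_positive([2, 1, 1, 1, 1, 1, 1, 1, 1]))  # total of 10
    print(tool_output(0.5))
```

Keeping a middle band between the two thresholds is one simple way to produce an "uncertain" category; the actual tool may handle uncertainty differently.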
Main results: The dataset used to train the AI model consisted of 10,442 samples, while an additional 4,456 samples were used in a validation set to assess its accuracy.
- The tool demonstrated a sensitivity of 71%, correctly identifying depression in 71% of people who had it.
- Specificity was 74%, correctly ruling out depression in 74% of people who did not have it.
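For readers less familiar with these two measures, the short Python sketch below shows how they are computed from counts of correct and incorrect classifications. The counts in the example are made up so that the arithmetic matches the reported percentages; they are not the study's actual counts, and the function names are chosen here for illustration.

```python
def sensitivity(true_pos, false_neg):
    """Share of people with the condition whom the tool correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of people without the condition whom the tool correctly clears."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts: 100 people with depression, 100 without.
    print(f"sensitivity: {sensitivity(71, 29):.0%}")
    print(f"specificity: {specificity(74, 26):.0%}")
```

With these illustrative counts, 29 of every 100 people with depression would be missed and 26 of every 100 without it would be flagged, which is why a tool with these figures is better suited to supporting a clinician's judgment than to replacing it.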
Why it matters: The study findings suggest that machine learning technology could serve as a complementary decision-support tool for assessing depression.
Journal reference:
Mazur, A., et al. (2025) Evaluation of an AI-Based Voice Biomarker Tool to Detect Signals Consistent With Moderate to Severe Depression. The Annals of Family Medicine. doi.org/10.1370/afm.240091.