A study shows that a machine learning-based predictive model can be trained on telephone conversations to identify early signs of Alzheimer’s disease.
Elderly Man on the Telephone. Image Credit: fizkes/Shutterstock.com
A predictive model using vocal features to identify the risk of Alzheimer’s disease
The diagnosis of Alzheimer’s disease is difficult due to the wide range of symptoms, person-to-person variability, diagnostic costs, and the varying capabilities of health centers. Nonetheless, identifying individuals at risk in the early phases of the disease is key to beginning treatment and to alleviating the burden of Alzheimer’s disease on patients and caregivers.
In recent years, mounting evidence has demonstrated the greater accuracy and efficiency of predictive models that use machine-learning algorithms. Many models have been developed for clinical purposes, including the prediction of Alzheimer’s disease from clinical health data, but they have shown varying effectiveness.
To build upon and refine existing models, Akihiro Shimoda and colleagues from the Department of Public Health at McCann Healthcare Worldwide Japan Inc. based in Tokyo, Japan, have developed a new and improved machine learning model using vocal features to identify early signs of Alzheimer’s disease. Their study is published in the journal PLOS One.
The use of vocal features is key as clinical evidence suggests that patients affected by Alzheimer’s disease are more likely to speak slowly, with long pauses, and spend time finding correct words, which results in broken messages that show a distinct lack of speech fluency.
To develop accurate predictive models, researchers used 1,465 audio data files from 99 healthy individuals as well as 151 audio data files recorded from 24 patients diagnosed with Alzheimer’s disease, all of which originated from a dementia prevention program in Hachioji City in Tokyo. The recordings were taken from telephone conversations sampled between March and May of 2020 with individuals aged 65 or older.
From the audio files, researchers developed machine learning models based on extreme gradient boosting (XGBoost), random forest (RF), and logistic regression (LR), treating each audio file as a single observation. The predictive performance of the models was then assessed using the receiver operating characteristic (ROC) curve, which summarizes the trade-off between sensitivity and specificity and so gives an indication of overall effectiveness.
The performance of the machine learning algorithms was then compared with that of conventional cognitive tests.
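To make the ROC-based assessment described above more concrete, the following is a minimal sketch in plain Python of how sensitivity, specificity, and the area under the ROC curve can be computed from a model's risk scores. It is not the authors' code: the labels and scores are invented toy values rather than study data, and the function names (roc_points, auc, sensitivity_specificity) are hypothetical names chosen for illustration.

```python
# Illustrative only: toy labels and scores, NOT data from the study.
# 1 = diagnosed with Alzheimer's disease, 0 = healthy.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical model outputs: a higher score means "more likely diagnosed".
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]


def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Sweep thresholds from the highest score down; tied scores move together.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold count as positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    print("AUC:", round(auc(roc_points(labels, scores)), 3))
    sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
    print("sensitivity:", round(sens, 3), "specificity:", round(spec, 3))
```

An area of 1.0 would mean the scores separate the two groups perfectly, while 0.5 would be no better than chance. Because the study treated each audio file as one observation, a real evaluation would also need to decide how to handle several files from the same speaker, which this sketch does not attempt.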
Models showed a similar predictive performance to other diagnostic tests
Results showed that the predictive accuracy of the machine-learning models did not differ significantly from that of other diagnostic tests.
However, several limitations require consideration. The samples were binary: individuals were either healthy or diagnosed, so the data did not cover a range of clinical diagnoses that could support more effective identification of disease severity. This can be addressed in future studies that could use retrospective data from patients with limited, mild, or severe symptoms.
Moreover, predictive power was limited by sample size and audio quality; future work could encompass a larger population sample and a standardized recording quality. Additionally, only superficial vocal features were used for analysis, which could lead to a loss of information and to misidentification if individuals spoke with accents or had unusual pronunciation.
Nevertheless, the accuracy of the developed models was supported by large training datasets that provided key indications of differential vocal features between healthy individuals and diagnosed patients. The findings therefore demonstrate that this novel prediction model using daily phone conversations has promising potential for assessing the risk of Alzheimer’s disease. In the future, studies could expand on this concept by including a larger and more diverse sample population to further improve the models of the present study.
Journal reference:
- Shimoda A, Li Y, Hayashi H, Kondo N (2021) Dementia risks identified by vocal features via telephone conversations: A novel machine learning prediction model. PLoS ONE 16(7): e0253988. https://doi.org/10.1371/journal.pone.0253988