In a recent article published in Translational Psychiatry, researchers performed a systematic review of studies that applied artificial intelligence (AI)-based natural language processing (NLP) to examine mental health interventions (MHI).
Study: Natural language processing for mental health interventions: a systematic review and research framework.
Background
Globally, neuropsychiatric disorders such as depression and anxiety place a significant economic burden on healthcare systems, with the cost of mental health conditions projected to reach six trillion US dollars annually by 2030.
Numerous MHIs, including behavioral, psychosocial, pharmacological, and telemedicine-based interventions, appear effective in promoting the well-being of affected individuals. However, systemic issues in mental healthcare limit their effectiveness and their ability to meet increasing demand.
Moreover, the clinical workforce is scarce and requires extensive training in mental health assessment, the quality of available treatment is variable, and current quality assurance practices cannot address the reduced effect sizes seen when MHIs are delivered at scale.
Given these quality gaps, especially in developing countries, more research is needed on tools, particularly machine learning (ML)-based tools, that facilitate mental health diagnosis and treatment.
NLP enables rapid, quantitative study of conversation transcripts and medical records from thousands of patients. It renders words into numeric and graphical representations, a task previously considered impractical at scale. More importantly, it can examine the characteristics of providers and patients to detect meaningful trends in large datasets.
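As a simple illustration of what "rendering words into numbers" means, the minimal sketch below (not drawn from the reviewed studies; the transcript-like sentences are hypothetical) converts a few lines of text into a document-term count matrix with scikit-learn:

```python
# Minimal sketch: turn text into a numeric document-term matrix (illustrative only).
from sklearn.feature_extraction.text import CountVectorizer

transcripts = [
    "I have been feeling anxious about work",
    "Work has been stressful and I feel low",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(transcripts)   # sparse matrix: documents x vocabulary

print(vectorizer.get_feature_names_out())   # the learned vocabulary
print(X.toarray())                          # word counts per transcript
```

Representations like this, and the denser ones discussed below, are what allow statistical and ML models to operate on conversational data.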
Digital health platforms have made MHI data more readily available, allowing NLP tools to analyze treatment fidelity, patient outcomes, treatment components, therapeutic alliance, and suicide risk.
Lastly, NLP can also be applied to social media data and electronic health records (EHRs) in mental health-relevant contexts.
While NLP has shown research potential, the current separation between clinical and computer science researchers has limited its impact on clinical practice.
Thus, even though the use of machine learning in the mental health domain has increased, clinical research has largely not drawn on peer-reviewed manuscripts from AI conferences reporting advances in NLP.
About the study
In the present study, researchers classified the NLP methods deployed to study MHI, identified the clinical domains they address, and used these domains to aggregate NLP findings.
They examined the main features of the NLP pipeline in each manuscript, including linguistic representations, software packages, and classification and validation methods. Likewise, they evaluated each study's clinical settings, goals, transcript origins, clinical measures, ground truths, and raters.
Moreover, the researchers evaluated NLP-MHI studies to identify common areas, biases, and knowledge gaps in applying NLP to MHI, and to propose a research framework that could help computational and clinical researchers improve the clinical utility of these tools.
They screened articles in the PubMed, PsycINFO, and Scopus databases to identify studies focused solely on NLP for human-to-human MHI, such as psychotherapy, patient assessment, psychiatric treatment, and crisis counseling.
Further, the researchers searched peer-reviewed AI conference proceedings (e.g., the Association for Computational Linguistics) through arXiv and Google Scholar.
They compiled articles that met five criteria:
i) were original empirical studies;
ii) were published in English;
iii) were peer-reviewed;
iv) were MHI-focused; and
v) analyzed textual data retrieved from MHI (e.g., transcripts).
Results
The final sample comprised 102 studies, primarily involving face-to-face randomized controlled trials (RCTs), conventional treatments, and collected therapy corpora.
Nearly 54% of these studies were published between 2020 and 2022, suggesting a surge in NLP-based methods for MHI applications.
Six clinical categories emerged in the review: two related to patients, two to providers, and two to patient-provider interactions.
These were clinical presentation and intervention response (patients), intervention monitoring and provider characteristics (providers), and relational dynamics and conversational topics (interactions). All six operated simultaneously as factors in treatment outcomes.
Clinicians provided ground-truth ratings in 31 studies, while patients did so in 22 studies through self-reported symptom feedback and treatment alliance ratings. The most prevalent source of provider/patient information was Motivational Interviewing Skill Code (MISC) annotations.
Multiple NLP approaches emerged, reflecting the temporal development of NLP tools and the evolving ways patient-provider conversations are represented linguistically. Word embeddings were the most widely used language representation, appearing in 48% of studies.
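For readers unfamiliar with word embeddings, the sketch below is illustrative only: the toy sentences and parameters are hypothetical, not taken from the reviewed studies. It trains a small Word2Vec model with gensim and retrieves the dense vector learned for a single word.

```python
# Minimal sketch: learn dense word vectors from toy "transcript" sentences.
from gensim.models import Word2Vec

sentences = [
    ["i", "feel", "anxious", "about", "work"],
    ["therapy", "helps", "me", "manage", "anxiety"],
    ["work", "stress", "makes", "me", "feel", "low"],
]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

vector = model.wv["anxious"]                      # 50-dimensional dense vector
neighbors = model.wv.most_similar("work", topn=3) # nearest words in embedding space
print(vector[:5], neighbors)
```

Unlike raw word counts, such vectors place semantically related words near one another, which is what makes them useful for modeling conversational content.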
The two most prevalent NLP model features were lexicons and sentiment analysis, used in 43 and 32 studies, respectively. The latter generated feature scores for emotions (e.g., joy) using lexicon-based methods.
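A lexicon-based emotion score can be as simple as counting matches against category word lists. The sketch below uses a tiny hypothetical lexicon for illustration; studies in this literature typically rely on validated lexicons such as LIWC or NRC.

```python
# Minimal sketch: lexicon-based emotion scoring with a hypothetical mini-lexicon.
import re

emotion_lexicon = {
    "joy": {"happy", "glad", "hopeful"},
    "sadness": {"sad", "low", "hopeless"},
    "anxiety": {"anxious", "worried", "nervous"},
}

def emotion_scores(text: str) -> dict:
    """Count how many tokens in the text fall into each emotion category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        emotion: sum(token in words for token in tokens)
        for emotion, words in emotion_lexicon.items()
    }

print(emotion_scores("I feel anxious and worried, but also hopeful"))
# {'joy': 1, 'sadness': 0, 'anxiety': 2}
```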
Over time, context-sensitive deep neural networks replaced word-count and frequency-based lexicon methods in NLP models. A total of 16 studies also used topic modeling to identify common themes across clinical transcripts.
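Topic modeling of this kind typically applies a method such as latent Dirichlet allocation (LDA) to a document-term matrix. The sketch below shows the general pattern with scikit-learn; the toy transcripts are hypothetical and far smaller than a real therapy corpus.

```python
# Minimal sketch: discover recurring themes in toy transcripts with LDA.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

transcripts = [
    "sleep has been difficult and I wake up tired",
    "my sleep schedule is off and I feel exhausted",
    "conflict with my partner keeps coming up",
    "we argued again and the relationship feels tense",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(transcripts)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

# Print the top words for each discovered topic
terms = vectorizer.get_feature_names_out()
for idx, topic in enumerate(lda.components_):
    top_words = [terms[i] for i in topic.argsort()[-4:][::-1]]
    print(f"Topic {idx}: {top_words}")
```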
Beyond linguistic content, acoustic characteristics emerged as a promising source of treatment data, with 16 studies examining features of patient and provider speech.
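Acoustic analyses generally begin by extracting frame-level features from recorded speech. The sketch below is a minimal, hypothetical example using librosa; the file name and the choice of MFCCs and pitch are assumptions for illustration, not details from the reviewed studies.

```python
# Minimal sketch: extract simple acoustic features from a (hypothetical) session recording.
import librosa
import numpy as np

y, sr = librosa.load("session_audio.wav", sr=16000)   # hypothetical file, loaded at 16 kHz

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)    # 13 MFCCs per frame
f0 = librosa.yin(y, fmin=65, fmax=300, sr=sr)         # frame-level pitch estimate (Hz)

# Summarize features over time for use in downstream models
features = np.concatenate([mfcc.mean(axis=1), [np.nanmean(f0)]])
print(features.shape)  # (14,)
```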
The authors noted that research in this area has made considerable progress in mental health diagnosis and treatment specification, as well as in identifying the quality of care delivered to patients.
Accordingly, they proposed integrating these distinct contributions into a single framework (NLPxMHI) that helps computational and clinical researchers collaborate, and they outlined novel NLP applications for innovation in mental health services.
Only 40 studies reported demographic information for the datasets used. The authors therefore recommended that NLPxMHI researchers document the demographic data of all individuals whose data are used in model training and evaluation.
In addition, they emphasized over-sampling underrepresented groups to help address biases and improve the representativeness of NLP models.
Further, they recommended representing treatment as a sequence of actions to improve the accuracy of intervention studies, emphasizing the importance of timing and context for beneficial effects. Integrating the identified clinical categories into a unified model could also help investigators enrich treatment recommendations.
Few of the reviewed studies implemented techniques to enhance interpretability, which likely hindered investigators from interpreting the overall behavior of the NLP models across inputs.
Nonetheless, ongoing collaboration between the clinical and computational domains should gradually close the gap between interpretability and accuracy through clinical review, model tuning, and tests of generalizability. In the future, this might help outline valid treatment decision rules and fulfill the promise of precision medicine.
Conclusions
Overall, NLP methods have the potential to operationalize MHI, and proof-of-concept applications have shown promise in addressing systemic challenges in mental healthcare.
However, for the NLPxMHI framework to bridge research designs and disciplines, continued progress will require large, secure datasets, a common language across fields, and equity checks.
The authors anticipate that this could revolutionize the assessment and treatment of mental health conditions.