Can AI be your therapist? Study shows ChatGPT outperforms professionals in key areas

Researchers explore the implications of AI-assisted mental health care and the future of psychotherapy.

Study: When ELIZA meets therapists: A Turing test for the heart and mind. Image Credit: MUNGKHOOD STUDIO/Shutterstock.com

In a recent study published in PLOS Mental Health, researchers investigated whether humans could differentiate between responses written by expert therapists and those generated by Chat Generative Pre-trained Transformer 4 (ChatGPT-4).

“Can machines think?” is a question famously posed by Alan Turing after the Second World War. Technology has advanced markedly since the mid-1900s, and growing evidence suggests that generative artificial intelligence (GenAI) could be helpful in psychotherapy.

Recent studies reveal promising effects of GenAI in psychotherapy, whether as an adjunct or as a standalone solution. Reports also indicate that AI can write empathic content that is rated highly by therapists and, in some cases, outperforms that of professionals.

About the study

In the present study, researchers investigated whether a panel of participants could distinguish couple-therapy responses written by human experts from those generated by ChatGPT-4.

First, experts with advanced degrees in counseling psychology, clinical psychology, psychiatry, and marriage and family therapy were recruited. The experts were randomly assigned to one of two sets of couple therapy vignettes and given a month to write responses to them.

After completion, experts in each group ranked the other group’s responses, identifying the three most likely to succeed on the common factors test and the Turing test. Next, ChatGPT-4 was given a single prompt to generate responses to the same vignettes.

The prompt defined professionalism, empathy, therapeutic alliance, efficacy, and cultural competence. ChatGPT-4’s responses were then similarly rated by the study authors.

The best-rated ChatGPT-4 responses were selected to compete with those of the human experts. These responses were aggregated and distributed as a survey to a diverse panel of individuals representative of the United States (US) population.

Respondents were randomized to receive a response from either a therapist or ChatGPT-4 and were asked to 1) rate how well it aligned with the common factors of therapy and 2) guess whether ChatGPT-4 or a human therapist had written it.
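This survey design lends itself to a simple tabular analysis. The sketch below is purely illustrative and is not the study’s actual code; the column names and ratings are hypothetical placeholders for how such randomized responses might be structured and scored.

```python
# Illustrative sketch only: hypothetical survey data for this kind of protocol.
import pandas as pd

# Each row represents one respondent randomized to a single response
# written by either a therapist or ChatGPT-4.
survey = pd.DataFrame({
    "actual_author":  ["therapist", "chatgpt", "chatgpt", "therapist"],
    "guessed_author": ["chatgpt",   "chatgpt", "therapist", "therapist"],
    # Hypothetical 1-5 ratings on common-factor items
    "empathy": [4, 5, 4, 3],
    "alliance": [3, 5, 4, 4],
    "cultural_competence": [4, 4, 5, 3],
})

# 1) Mean common-factor ratings by actual author
common_factors = ["empathy", "alliance", "cultural_competence"]
print(survey.groupby("actual_author")[common_factors].mean())

# 2) "Turing test" accuracy: how often respondents guessed the author correctly
accuracy = (survey["guessed_author"] == survey["actual_author"]).mean()
print(f"Identification accuracy: {accuracy:.0%}")
```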

Findings

In total, 13 therapists with at least five years of experience constituted the expert panel; most had backgrounds in couple therapy. The panel of survey respondents comprised 830 individuals with a mean age of 45 years.

Of these, 50.6% were female, 47.9% were male, and 0.2% were non-binary. Nearly 60% were in a romantic relationship, and 18% reported having ever engaged in couple therapy.

In addition, 49.4% of respondents were non-Hispanic White, 18.8% were Black, 16.8% were White Hispanic, and 5% were Asian, among other groups. Survey respondents performed poorly at identifying whether responses came from ChatGPT-4 or therapists.

They correctly identified the therapists’ responses only 5% more often than ChatGPT’s. Further, ChatGPT’s responses were rated higher on all therapeutic common factors than the therapists’ responses.

Moreover, responses from ChatGPT were more likely to be categorized as empathic, culturally competent, and connecting than those written by therapists.

Participants who believed that a therapist wrote the response rated it higher, while those who thought it was ChatGPT-generated rated it lower. This prompted an additional post hoc analysis, which revealed a marked attribution bias.

That is, respondents rated responses more positively when they were attributed to therapists, regardless of the actual author. Ratings also varied with the accuracy of attribution.

For instance, therapists’ responses that were misattributed to ChatGPT received the least favorable ratings.
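An attribution-bias check of this kind can be pictured as a cross-tabulation of mean ratings by actual author and by believed author. The sketch below uses invented numbers purely for illustration and is not taken from the study.

```python
# Hypothetical illustration of an attribution-bias cross-tabulation.
import pandas as pd

ratings = pd.DataFrame({
    "actual_author":  ["therapist", "therapist", "chatgpt", "chatgpt"],
    "guessed_author": ["therapist", "chatgpt",   "therapist", "chatgpt"],
    "mean_rating":    [4.2,          3.1,         4.4,         3.6],  # invented values
})

attribution_table = ratings.pivot_table(
    values="mean_rating",
    index="actual_author",      # who actually wrote the response
    columns="guessed_author",   # who the respondent believed wrote it
    aggfunc="mean",
)
print(attribution_table)
# An attribution bias appears as higher values in the "therapist" column;
# the least favorable cell is therapist-written responses guessed as ChatGPT.
```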

The researchers also compared part-of-speech usage and sentiment between the therapists’ and ChatGPT-4’s responses. ChatGPT-generated responses were longer and contained more positive sentiment, as well as more nouns, adjectives, verbs, pronouns, and adverbs, than the human-written responses.

Even after controlling for response length, ChatGPT’s responses contained more adjectives and nouns but a similar number of adverbs, pronouns, and verbs.
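As a rough illustration of this kind of linguistic comparison, the sketch below counts parts of speech and scores sentiment with NLTK, then applies a simple length control by normalizing counts per word. The texts are invented, and this is an assumed workflow rather than the researchers’ actual pipeline.

```python
# Rough, illustrative sketch (invented texts, not the study's pipeline).
import nltk
import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Resource names cover both older and newer NLTK releases.
for resource in ("punkt", "punkt_tab", "averaged_perceptron_tagger",
                 "averaged_perceptron_tagger_eng", "vader_lexicon"):
    nltk.download(resource, quiet=True)

responses = pd.DataFrame({
    "source": ["therapist", "chatgpt"],
    "text": [
        "It sounds like you both feel unheard right now.",
        "Thank you for sharing this; it takes courage to open up, and I can "
        "hear how much you both care about repairing this connection.",
    ],
})

sia = SentimentIntensityAnalyzer()

def describe(text: str) -> pd.Series:
    """Count words, nouns, and adjectives, and score overall sentiment."""
    tokens = nltk.word_tokenize(text)
    tags = [tag for _, tag in nltk.pos_tag(tokens)]
    return pd.Series({
        "n_words": len(tokens),
        "n_nouns": sum(tag.startswith("NN") for tag in tags),
        "n_adjectives": sum(tag.startswith("JJ") for tag in tags),
        "sentiment": sia.polarity_scores(text)["compound"],
    })

features = pd.concat([responses, responses["text"].apply(describe)], axis=1)

# Simple length control: compare per-word rates instead of raw counts.
features["nouns_per_word"] = features["n_nouns"] / features["n_words"]
features["adjectives_per_word"] = features["n_adjectives"] / features["n_words"]
print(features[["source", "n_words", "nouns_per_word",
                "adjectives_per_word", "sentiment"]])
```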

Conclusions

The accurate identification of responses from ChatGPT and therapists was only slightly better than chance.

This indicates that people have difficulty distinguishing machine-generated from human-written responses, supporting Turing’s prediction that humans would be unable to tell the two apart. Moreover, ChatGPT-4’s responses were rated considerably higher on all common factors of therapy than the human responses.
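To make the “slightly better than chance” comparison concrete, one could test the observed identification accuracy against 50% with a binomial test, as in the sketch below. The counts used here are invented for illustration and are not the study’s figures.

```python
# Illustrative binomial test of identification accuracy against chance (50%).
from scipy.stats import binomtest

n_guesses = 830   # hypothetical number of authorship guesses
n_correct = 465   # hypothetical number of correct guesses (~56%)

result = binomtest(n_correct, n_guesses, p=0.5, alternative="two-sided")
print(f"Observed accuracy: {n_correct / n_guesses:.1%}")
print(f"p-value vs. 50% chance: {result.pvalue:.3g}")
```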

The study’s limitations include the small number of vignettes, which represent only a fraction of what arises in real therapeutic settings; the use of a single prompt to generate the GenAI responses; and the limited number of expert therapists, which included couple therapists.

Given the likelihood that GenAI will be incorporated into therapeutic settings, mental health experts will need to understand machine learning, become more technically literate, and ensure careful training and supervision of these models.

Journal reference:

When ELIZA meets therapists: A Turing test for the heart and mind. PLOS Mental Health (2025).

Written by

Tarun Sai Lomte

Tarun is a writer based in Hyderabad, India. He has a Master’s degree in Biotechnology from the University of Hyderabad and is enthusiastic about scientific research. He enjoys reading research papers and literature reviews and is passionate about writing.
