AI chatbots outperform doctors in empathy and readability for cancer-related questions, study finds

In a recent study published in JAMA Oncology, researchers compared the replies of online conversational artificial intelligence (AI) chatbots to cancer-related patient questions with those of licensed physicians in terms of empathy, response quality, and readability.

Digital oncology solutions can help cut expenses, enhance patient care outcomes, and minimize physician burnout. AI has driven significant advances in healthcare delivery, notably conversational AI-based chatbots that inform cancer patients about clinical diagnoses and treatment options. However, the ability of AI chatbots to produce replies grounded in cancer knowledge has yet to be validated. Interest in deploying these technologies in patient-facing roles is considerable, but their medical accuracy, empathy, and readability remain unknown. Recent studies have reported that chatbot replies to general medical inquiries posted online are more empathetic than physician replies.

Study: Physician and Artificial Intelligence Chatbot Responses to Cancer Questions From Social Media. Image Credit: Jirsak / Shutterstock

About the study

In the present equivalence study, researchers evaluated several state-of-the-art chatbots using pilot measures of response quality, empathy, and readability to assess their competence in answering oncology-related patient concerns. They investigated the ability of three AI chatbots, GPT-3.5 (chatbot 1), GPT-4.0 (chatbot 2), and Claude AI (chatbot 3), to provide high-quality, empathetic, and readable replies to cancer-related questions from patients.

The researchers compared AI chatbot replies with responses from six verified physicians to 200 cancer-related questions posed by patients in a public online forum. Data were collected on May 31, 2023. The study exposures comprised 200 cancer-related patient questions posted online between January 1, 2018, and May 31, 2023, which were submitted to the three AI chatbots.

The primary study outcomes were pilot ratings of quality, empathy, and readability on Likert scales ranging from 1.0 (very poor) to 5.0 (very good). Physicians from radiation oncology, medical oncology, and palliative and supportive care graded quality, empathy, and readability. The secondary outcome was objective readability, measured using Flesch-Kincaid Grade Level (FKGL), Gunning-Fog Index, and Automated Readability Index scores.
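The study does not describe the tooling used to compute these indices; the sketch below is an illustration only, showing how the Flesch-Kincaid Grade Level and Automated Readability Index are conventionally derived from sentence, word, syllable, and character counts (the syllable counter is a rough vowel-group heuristic, and the helper functions are hypothetical). The Gunning-Fog Index follows the same pattern, additionally counting words with three or more syllables.

    # Illustrative sketch only (not the study's tooling): standard formulas for the
    # Flesch-Kincaid Grade Level (FKGL) and Automated Readability Index (ARI).
    import re

    def count_syllables(word):
        # Approximate syllables as runs of vowels; every word gets at least one.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def readability(text):
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)
        n_words = max(1, len(words))
        n_sentences = max(1, len(sentences))
        n_chars = sum(len(w) for w in words)
        n_syllables = sum(count_syllables(w) for w in words)
        fkgl = 0.39 * (n_words / n_sentences) + 11.8 * (n_syllables / n_words) - 15.59
        ari = 4.71 * (n_chars / n_words) + 0.5 * (n_words / n_sentences) - 21.43
        return {"words": n_words, "FKGL": round(fkgl, 1), "ARI": round(ari, 1)}

    print(readability("Chemotherapy can cause fatigue. Ask your oncology team about supportive care."))

Lower grade-level scores indicate text that is easier to read, which is why the physician replies' lower FKGL values discussed below correspond to greater estimated readability.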

The researchers assessed the cognitive load of reading comprehension using mean dependency distance (a measure of syntactic complexity) and textual lexical diversity. The chatbots were prompted to limit their response length to the average physician response word count (125 words). Each question's responses were blinded and presented in random order. The team conducted a one-way analysis of variance (ANOVA) with post-hoc tests to compare the 200 readability, empathy, and quality ratings and 90 readability metrics between chatbot and physician replies, and used Pearson correlation coefficients to assess the relationships between measures.
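As an illustration of this analysis pattern only, the sketch below runs a one-way ANOVA across four responder groups and a Pearson correlation between word count and rated quality using SciPy; the ratings and word counts are simulated, not the study's data, and post-hoc pairwise comparisons (for example, Tukey HSD) would follow a significant ANOVA.

    # Hypothetical sketch: simulated 1-5 ratings standing in for the study's data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    physician = rng.normal(3.0, 0.5, 200).clip(1, 5)   # simulated physician ratings
    chatbot_1 = rng.normal(3.4, 0.5, 200).clip(1, 5)
    chatbot_2 = rng.normal(3.5, 0.5, 200).clip(1, 5)
    chatbot_3 = rng.normal(3.6, 0.5, 200).clip(1, 5)

    # One-way ANOVA comparing mean ratings across the four groups.
    f_stat, p_value = stats.f_oneway(physician, chatbot_1, chatbot_2, chatbot_3)
    print(f"one-way ANOVA: F = {f_stat:.2f}, p = {p_value:.4g}")

    # Pearson correlation between response word count and rating (simulated lengths).
    word_counts = rng.normal(130, 20, 200)
    r, p = stats.pearsonr(word_counts, chatbot_3)
    print(f"Pearson r = {r:.2f}, p = {p:.3f}")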

Results

Physician raters consistently scored chatbot replies higher for quality, empathy, and readability of writing style. Responses generated by chatbots 1, 2, and 3 were consistently superior to physician responses on mean quality component measures such as medical correctness, completeness, and focus. Similarly, chatbot replies scored higher than physician replies on both component and overall empathy measures.

Responses to the 200 questions generated by chatbot 3, the highest-rated AI chatbot, were consistently rated higher than physician responses on overall measures of quality, empathy, and readability, with mean values of 3.6 (vs. 3.0), 3.56 (vs. 2.4), and 3.8 (vs. 3.1), respectively. The mean Flesch-Kincaid grade level of physician replies (mean, 10.1) did not differ significantly from that of the third chatbot's responses (mean, 10.3), although it was lower than those of the first (mean, 12.3) and second chatbots (mean, 11.3).

Physician replies had lower FKGL scores, indicating greater estimated readability than chatbot responses and implying that chatbot replies may be harder to read because of longer words and sentences. The mean word count of the third chatbot's replies was higher than that of physician responses (136 vs. 125), whereas the first chatbot's (mean, 136) and second chatbot's (mean, 140) replies did not differ significantly from physician responses. Word count was strongly associated with quality ratings for physician, first-chatbot, and second-chatbot replies, and with empathy ratings for physician and third-chatbot replies.

Despite efforts to regulate word count, only the third chatbot's responses had higher word counts than physician replies. The first (mean, 12) and second chatbot replies (mean, 11) had considerably higher FKGL scores than physician replies (mean, 10), whereas the third chatbot's replies (mean, 10) were comparable to physician responses. However, physician replies received a 19% lower readability rating (mean, 3.1) than those of chatbot 3, the best-performing chatbot (mean, 3.8).

The study showed that conversational AI chatbots may deliver high-quality, empathetic, and readable replies to patient inquiries comparable to those provided by physicians. Future studies should examine the breadth of chatbot-mediated interactions, their integration into care processes, and their outcomes. Specialized AI chatbots trained on large medical text corpora might support cancer patients emotionally and improve oncology care. They may also serve as point-of-care digital health tools and offer information to vulnerable groups. Future randomized controlled trials should establish standards to ensure appropriate oversight and outcomes for clinicians and patients. The greater empathy of chatbot replies may help strengthen healthcare partnerships.

Journal reference:
Brief Report: Physician and Artificial Intelligence Chatbot Responses to Cancer Questions From Social Media. JAMA Oncology.

Written by

Pooja Toshniwal Paharia

Pooja Toshniwal Paharia is an oral and maxillofacial physician and radiologist based in Pune, India. Her academic background is in Oral Medicine and Radiology. She has extensive experience in research and evidence-based clinical-radiological diagnosis and management of oral lesions and conditions and associated maxillofacial disorders.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Toshniwal Paharia, Pooja. (2024, May 20). AI chatbots outperform doctors in empathy and readability for cancer-related questions, study finds. News-Medical. Retrieved on December 11, 2024 from https://www.news-medical.net/news/20240520/AI-chatbots-outperform-doctors-in-empathy-and-readability-for-cancer-related-questions-study-finds.aspx.

  • MLA

    Toshniwal Paharia, Pooja. "AI chatbots outperform doctors in empathy and readability for cancer-related questions, study finds". News-Medical. 11 December 2024. <https://www.news-medical.net/news/20240520/AI-chatbots-outperform-doctors-in-empathy-and-readability-for-cancer-related-questions-study-finds.aspx>.

  • Chicago

    Toshniwal Paharia, Pooja. "AI chatbots outperform doctors in empathy and readability for cancer-related questions, study finds". News-Medical. https://www.news-medical.net/news/20240520/AI-chatbots-outperform-doctors-in-empathy-and-readability-for-cancer-related-questions-study-finds.aspx. (accessed December 11, 2024).

  • Harvard

    Toshniwal Paharia, Pooja. 2024. AI chatbots outperform doctors in empathy and readability for cancer-related questions, study finds. News-Medical, viewed 11 December 2024, https://www.news-medical.net/news/20240520/AI-chatbots-outperform-doctors-in-empathy-and-readability-for-cancer-related-questions-study-finds.aspx.

