AI outperforms clinicians in satisfaction ratings for medical advice responses

Researchers at Stanford University found that AI-generated responses to patient messages received higher satisfaction ratings than clinician responses, although empathy and information quality were rated highest for responses to endocrinology questions.

Research Letter: Perspectives on Artificial Intelligence–Generated Responses to Patient Messages. Image Credit: Munthita / Shutterstock

In a recent study published in JAMA Network Open, researchers at Stanford University evaluated laypeople's satisfaction with artificial intelligence (AI) responses relative to clinician responses to patient messages. Generative AI can potentially help clinicians respond to patients' messages. While AI-generated responses exhibit acceptable quality and a low risk of harm, laypersons' perspectives on AI-generated responses have rarely been explored in detail.

The study and findings

In this cross-sectional study, researchers investigated laypersons’ satisfaction with AI-generated responses compared to clinician-to-patient messages. They screened 3,769,023 patient medical advice requests in health records and included 59 clinical questions for analysis. Two generative AI models were used: Stanford Generative Pretrained Transformer (GPT) and ChatGPT-4. These tools generated responses with and without prompt engineering. The AI responses generated with prompt engineering were selected for the final analysis because they showed higher information quality and empathy.

Six licensed clinicians rated the original clinician responses and the AI responses for information quality and empathy on a five-point Likert scale, with 5 indicating the best and 1 indicating the worst. Additionally, 30 participants, recruited through the Stanford Research Registry, assessed AI and clinician responses for satisfaction. Each response was independently evaluated by three participants, with a score of 5 indicating extremely satisfied and 1 indicating extremely dissatisfied. To account for potential evaluator bias and variability, the researchers developed mixed models to calculate effect estimates for empathy, satisfaction, and information quality.
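To make the rating design concrete, the short Python sketch below averages the three evaluators' Likert scores for each response and then averages those per-response means by source. It is only an illustration: the function names and every rating in it are hypothetical, not study data, and this simple averaging is a stand-in for, not a reproduction of, the mixed models the researchers used to adjust for evaluator variability.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (question_id, source, rater_id, satisfaction 1-5).
# These values are invented for illustration and are NOT study data.
ratings = [
    ("q1", "ai", "r1", 4), ("q1", "ai", "r2", 5), ("q1", "ai", "r3", 4),
    ("q1", "clinician", "r1", 3), ("q1", "clinician", "r2", 2),
    ("q1", "clinician", "r3", 4),
    ("q2", "ai", "r1", 3), ("q2", "ai", "r2", 4), ("q2", "ai", "r3", 5),
    ("q2", "clinician", "r1", 3), ("q2", "clinician", "r2", 3),
    ("q2", "clinician", "r3", 3),
]


def mean_by_response(rows):
    """Average the raters' scores for each (question, source) response."""
    grouped = defaultdict(list)
    for question_id, source, _rater_id, score in rows:
        if not 1 <= score <= 5:
            raise ValueError(f"Likert score out of range: {score}")
        grouped[(question_id, source)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


def mean_by_source(rows):
    """Average the per-response means separately for each source."""
    grouped = defaultdict(list)
    for (_question_id, source), value in mean_by_response(rows).items():
        grouped[source].append(value)
    return {source: mean(values) for source, values in grouped.items()}


summary = mean_by_source(ratings)
for source, value in sorted(summary.items()):
    print(f"{source}: {value:.2f}")
```

Averaging within each response first keeps a response with unusually high or low scores from one evaluator from dominating the source-level mean, but unlike a mixed model it does not estimate or remove systematic differences between evaluators.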

The team used multivariable linear regression to investigate associations between response length and satisfaction, adjusting for sex, age, race, and ethnicity. Overall, 2,118 assessments of AI response quality and 408 assessments of satisfaction were included. Notably, satisfaction estimates for AI responses (mean 3.96) were significantly higher than for clinician responses (mean 3.05), both overall and by specialty. The highest satisfaction estimates were for AI responses to cardiology questions, whereas responses to endocrinology questions showed the highest empathy and information quality.

Clinician responses were much shorter, averaging 254 characters, compared with 1,471 characters for AI responses. Interestingly, the length of clinician responses was positively associated with satisfaction, particularly for cardiology questions, whereas no such association was found for AI response length.
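The idea of an association between response length and satisfaction can be sketched with a one-predictor least-squares fit in Python. This is a deliberate simplification: the study used multivariable linear regression adjusted for sex, age, race, and ethnicity, whereas the hypothetical helper `ols_slope_intercept` below uses length as the only predictor, and the lengths and scores are invented for illustration.

```python
from statistics import mean


def ols_slope_intercept(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical response lengths (characters) and mean satisfaction scores.
# These values are invented for illustration and are NOT study data.
lengths = [100, 200, 300, 400]
satisfaction = [2.0, 3.0, 3.0, 4.0]

slope, intercept = ols_slope_intercept(lengths, satisfaction)
print(f"Change in satisfaction per 100 characters: {slope * 100:.2f}")
```

A positive slope in a fit like this would indicate that longer responses tend to receive higher satisfaction scores; a slope near zero would correspond to the absence of an association, as reported for AI responses.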

Conclusions

The study assessed satisfaction with AI responses to patients’ questions in health records. The findings showed that AI-generated responses received consistently higher satisfaction ratings than clinician responses. However, satisfaction was not necessarily concordant with information quality and empathy: responses to cardiology questions had the highest satisfaction, whereas responses to endocrinology questions were rated highest in empathy and information quality.

Further, the length of clinician responses, but not of AI responses, was associated with satisfaction, suggesting that brevity in clinician-patient communication might lower satisfaction. One limitation of the study is that satisfaction was assessed by survey participants rather than by the patients who originally submitted the questions, so the original patients' satisfaction might differ.

Future studies should assess satisfaction with AI responses across various settings, including different medical centers, regions, patient populations, and specialties. Overall, the study underscores the importance of patients as stakeholders in developing and implementing AI in clinician-patient communications for optimal integration into practice.

Journal reference:

Written by

Tarun Sai Lomte

Tarun is a writer based in Hyderabad, India. He has a Master’s degree in Biotechnology from the University of Hyderabad and is enthusiastic about scientific research. He enjoys reading research papers and literature reviews and is passionate about writing.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Sai Lomte, Tarun. (2024, October 17). AI outperforms clinicians in satisfaction ratings for medical advice responses. News-Medical. Retrieved on January 20, 2025 from https://www.news-medical.net/news/20241017/AI-outperforms-clinicians-in-satisfaction-ratings-for-medical-advice-responses.aspx.

  • MLA

    Sai Lomte, Tarun. "AI outperforms clinicians in satisfaction ratings for medical advice responses". News-Medical. 20 January 2025. <https://www.news-medical.net/news/20241017/AI-outperforms-clinicians-in-satisfaction-ratings-for-medical-advice-responses.aspx>.

  • Chicago

    Sai Lomte, Tarun. "AI outperforms clinicians in satisfaction ratings for medical advice responses". News-Medical. https://www.news-medical.net/news/20241017/AI-outperforms-clinicians-in-satisfaction-ratings-for-medical-advice-responses.aspx. (accessed January 20, 2025).

  • Harvard

    Sai Lomte, Tarun. 2024. AI outperforms clinicians in satisfaction ratings for medical advice responses. News-Medical, viewed 20 January 2025, https://www.news-medical.net/news/20241017/AI-outperforms-clinicians-in-satisfaction-ratings-for-medical-advice-responses.aspx.
