Can ChatGPT be a diabetes consultant? Study probes the potential and pitfalls

In a recent study published in the journal PLoS ONE, researchers tested ChatGPT, a large language model geared for conversation, to investigate whether it could answer frequently asked questions about diabetes.

Artificial intelligence (AI), particularly ChatGPT, has gained significant attention for its potential clinical applications. Despite not being trained explicitly for the medical domain, ChatGPT has millions of active users globally, and studies have reported that individuals are more accepting of AI-based solutions in low-risk scenarios. This motivates further study of how large language models like ChatGPT are understood and used in everyday situations and routine clinical care.

Study: ChatGPT- versus human-generated answers to frequently asked questions about diabetes: A Turing test-inspired survey among employees of a Danish diabetes center. Image Credit: Andrey_Popov / Shutterstock

About the study

In the present study, researchers evaluated ChatGPT's diabetes expertise, specifically its capacity to answer frequently asked questions about diabetes in a manner similar to humans.

The researchers specifically explored whether participants whose diabetes expertise ranged from some familiarity to expert-level could distinguish human-written answers from ChatGPT-generated answers to common questions about diabetes. They also examined whether individuals who had interacted with diabetes patients as healthcare providers, or who had previously used ChatGPT, were better at detecting the ChatGPT-generated replies.

The study comprised a closed, Turing test-inspired online survey of all Steno Diabetes Center Aarhus (SDCA) employees (part-time or full-time). The survey included 10 multiple-choice questions, each paired with two answers, one written by humans and the other generated by ChatGPT, alongside questions on age, gender, prior ChatGPT use, and past contact with people with diabetes. Participants had to identify the ChatGPT-generated answer.

The ten questions covered pathophysiological processes, therapy, complications, physical activity, and diet. Eight questions came from the 'Frequently Asked Questions' section of the Danish Diabetes Association's website, accessed on 10 January 2023. The researchers designed the remaining questions to correspond to specific passages on the 'Knowledge Center for Diabetes' website and a report on physical activity and type 1 diabetes mellitus.

Logistic regression models were fitted for the analysis, and odds ratios (ORs) were determined. In secondary analyses, the team evaluated the influence of participant characteristics on the outcome. Based on simulations, a non-inferiority margin of 55% was pre-defined and published as part of the research protocol before data collection began. Human-written responses were taken directly from the source materials and websites from which the team drew the questions.
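The logic of a non-inferiority check like the one described above can be sketched as a one-sided test of whether the observed proportion of correct identifications exceeds the pre-specified 55% margin. This is a simplified illustration, not the study's actual analysis: the trial counts are assumed for the example, and the normal approximation below ignores the clustering of answers within participants that the study's regression models would account for.

```python
from math import sqrt, erf

def noninferiority_test(successes, trials, margin=0.55):
    """One-sided z-test: is the true proportion of correct
    identifications above the pre-specified margin?"""
    p_hat = successes / trials
    # Standard error under the null hypothesis p = margin
    se = sqrt(margin * (1 - margin) / trials)
    z = (p_hat - margin) / se
    # One-sided p-value from the standard normal CDF
    p_value = 1 - 0.5 * (1 + erf(z / sqrt(2)))
    return p_hat, z, p_value

# Illustrative numbers: 183 participants x 10 questions, ~60% correct
# overall (the exact per-question trial structure is an assumption here)
p_hat, z, p = noninferiority_test(successes=1098, trials=1830)
```

With roughly 60% correct against a 55% margin, the z-statistic is large and the one-sided p-value is far below conventional significance levels, consistent with the article's report that identification accuracy cleared the margin.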

For practical reasons, two researchers, both health professionals, trimmed a few responses to reach the desired word count. Before the questions were posed, the context and three examples (selected randomly from 13 question-answer pairs) were supplied to the model in the prompts, with each question asked in a separate chat window. Participants were invited by e-mail with person-specific URLs that allowed them to complete the survey once. Data were gathered between January 23 and 27, 2023.

Results

Of the 311 invited individuals, 183 completed the survey (59% response rate); 70% (n=129) were female, 64% had heard of ChatGPT previously, 19% had used it, and 58% (n=107) had past interaction with diabetes patients as healthcare practitioners. The model was directed to provide 45-to-65-word answers to match the human responses, but its answers averaged 70 words. After consultation recommendations and restatements of the opening lines of the questions were removed, the ChatGPT answers averaged 56 words.

Across the 10 questions, the proportion of correct responses ranged from 38% to 74%. Participants correctly identified the ChatGPT-generated replies 60% of the time, above the pre-specified 55% non-inferiority margin. Males and females had 64% and 58% probabilities, respectively, of correctly recognizing the AI-generated response. Individuals with past contact with diabetes patients had a 61% probability of answering correctly, compared with 57% for those without such contact.

Previous ChatGPT usage showed the most robust association with the outcome (OR, 1.5) among participant characteristics. An odds ratio of comparable size was observed for the model in which age over 50 years was associated with a higher likelihood of correctly recognizing the AI-generated response (OR, 1.3). Previous ChatGPT users and non-users correctly answered 67% and 58% of the questions, respectively. Contrary to the initial premise, participants could discern ChatGPT-generated from human-written replies better than tossing a fair coin.
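As a rough consistency check on the figures above, the unadjusted odds ratio implied by 67% accuracy for prior ChatGPT users versus 58% for non-users can be computed directly. This simple calculation is an illustration only: the study's reported OR comes from a regression model, so an unadjusted figure is expected to match it only approximately.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def odds_ratio(p1, p0):
    """Unadjusted odds ratio comparing two proportions."""
    return odds(p1) / odds(p0)

# Reported accuracy: 67% for previous ChatGPT users vs 58% for non-users
or_usage = odds_ratio(0.67, 0.58)  # close to the reported OR of 1.5
```

The result, about 1.47, lines up with the article's reported odds ratio of 1.5 for prior ChatGPT use.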

Conclusion

Overall, the study serves as an initial exploration into the capabilities and limitations of ChatGPT in providing patient-centered guidance for chronic disease management, specifically diabetes. While ChatGPT demonstrated some potential for accurately answering frequently asked questions, issues around misinformation and the lack of nuanced, personalized advice were evident. As large language models increasingly intersect with healthcare, rigorous studies are essential to evaluate their safety, efficacy, and ethical considerations in patient care, emphasizing the need for robust regulatory frameworks and continuous oversight.


Written by

Pooja Toshniwal Paharia

Pooja Toshniwal Paharia is an oral and maxillofacial physician and radiologist based in Pune, India. Her academic background is in Oral Medicine and Radiology. She has extensive experience in research and evidence-based clinical-radiological diagnosis and management of oral lesions and conditions and associated maxillofacial disorders.


