Evaluating the quality and readability of ChatGPT-generated patient-facing medical information in rhinology DOI

Alexander Z. Fazilat,

Camille Brenac,

Danae Kawamoto-Duran

et al.

European Archives of Oto-Rhino-Laryngology, Journal year: 2024, Issue: unknown

Published: Dec. 26, 2024

Language: English

Performance of Artificial Intelligence Chatbots in Responding to Patient Queries Related to Traumatic Dental Injuries: A Comparative Study DOI
Yeliz Güven, Omer Tarik Ozdemir, Melis Yazır Kavan

et al.

Dental Traumatology, Journal year: 2024, Issue: unknown

Published: Nov. 22, 2024

ABSTRACT Background/Aim Artificial intelligence (AI) chatbots have become increasingly prevalent in recent years as potential sources of online healthcare information for patients when making medical/dental decisions. This study assessed the readability, quality, and accuracy of responses provided by three AI chatbots to questions related to traumatic dental injuries (TDIs), either retrieved from popular question-answer sites or manually created based on hypothetical case scenarios. Materials and Methods A total of 59 injury queries were directed at ChatGPT 3.5, ChatGPT 4.0, and Google Gemini. Readability was evaluated using the Flesch Reading Ease (FRE) and Flesch–Kincaid Grade Level (FKGL) scores. To assess response quality and accuracy, the DISCERN tool, the Global Quality Score (GQS), and misinformation scores were used. Understandability and actionability were analyzed with the Patient Education Materials Assessment Tool for Printed Materials (PEMAT-P). Statistical analysis included the Kruskal–Wallis test with Dunn's post hoc test for non-normal variables and one-way ANOVA with Tukey's post hoc test for normal variables (p < 0.05). Results The mean FKGL and FRE scores for ChatGPT 3.5, ChatGPT 4.0, and Google Gemini were 11.2 and 49.25, 11.8 and 46.42, and 10.1 and 51.91, respectively, indicating that the responses were difficult to read and required a college-level reading ability. ChatGPT 3.5 had the lowest PEMAT-P scores among the chatbots (p < 0.001). ChatGPT 4.0 was rated higher in quality (GQS score of 5) compared with ChatGPT 3.5. Conclusions In this study, ChatGPT 3.5, although widely used, provided some misleading and inaccurate information about TDIs. In contrast, ChatGPT 4.0 and Google Gemini generated more accurate and comprehensive answers, making them more reliable auxiliary information sources. However, for complex issues like TDIs, no chatbot can replace the dentist in diagnosis, treatment, and follow-up care.
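
For context, the FRE and FKGL readability metrics cited above are computed from word, sentence, and syllable counts using the standard published Flesch formulas. The short Python sketch below shows only that calculation; it is not code from the study, and the example counts are hypothetical.

    # Standard Flesch readability formulas (published constants; illustrative only).
    def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
        # Higher scores mean easier text; scores near 50 indicate college-level difficulty.
        return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

    def flesch_kincaid_grade_level(words: int, sentences: int, syllables: int) -> float:
        # Approximates the US school grade needed to understand the text.
        return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

    # Hypothetical counts for one chatbot response (not data from the study):
    # flesch_kincaid_grade_level(270, 15, 452) -> ~11.2, i.e. college-level reading ability.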

Language: English

Cited by

5

Evaluation of the Usability of ChatGPT‐4 and Google Gemini in Patient Education About Rhinosinusitis DOI Open Access
Çağrı Becerik, Selçuk Yıldız, Çiğdem Tepe Karaca

et al.

Clinical Otolaryngology, Journal year: 2025, Issue: unknown

Published: Jan. 7, 2025

ABSTRACT Introduction Artificial intelligence (AI)-based chat robots are increasingly used for patient education about common diseases in the health field, as in every other field. This study aims to evaluate and compare patient education materials on rhinosinusitis created by two frequently used chat robots, ChatGPT-4 and Google Gemini. Method One hundred and nine questions taken from patient information websites were divided into 4 different categories: general knowledge, diagnosis, treatment, and surgery and complications, and were then asked to the chat robots. The answers given were evaluated by expert otolaryngologists; where their scores differed, a third, more experienced otolaryngologist finalised the evaluation. Questions were scored from 1 to 4: (1) comprehensive/correct, (2) incomplete/partially correct, (3) a mix of accurate and inaccurate data, potentially misleading, and (4) completely inaccurate/irrelevant. Results In the evaluation of ChatGPT-4, all answers in the diagnosis category were comprehensive/correct. In the evaluation of Google Gemini, the rate of completely inaccurate/irrelevant answers was statistically significantly higher in the treatment category, while the rate of comprehensive/correct answers was higher in the surgery and complications category. In the comparison between the two chat robots by category, ChatGPT-4 had a higher rate of comprehensive/correct answers than Google Gemini, and this difference was statistically significant. Conclusion The answers both chat robots provided about rhinosinusitis were sufficient and informative.
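
The two-rater protocol above can be summarised in a minimal sketch, under the assumption that the third, more experienced otolaryngologist is consulted only when the first two scores differ (function and variable names are illustrative, not the authors' code):

    # Scores use the abstract's 1-4 scale: 1 = comprehensive/correct ... 4 = completely inaccurate/irrelevant.
    def consensus_score(score_a: int, score_b: int, ask_senior) -> int:
        # ask_senior is a callable consulted only when the first two raters disagree.
        if score_a == score_b:
            return score_a
        return ask_senior()

    # Example: both raters give 1 -> 1; raters give 2 and 3 -> the senior rater's score is used.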

Language: English

Cited by

0

Is ChatGPT's Knowledge on Rhinology Accurate? Can It Be Utilized in Medical Education and Patient Information? DOI Creative Commons
Ahmet Çağlar ÖZDOĞAN,

Burçay TELLİOĞLU,

Oğuzhan KATAR

et al.

Research Square (Research Square), Journal year: 2025, Issue: unknown

Published: Jan. 20, 2025

Abstract Background: ChatGPT is a new artificial intelligence model designed to create human-like chat. As a result of advancing knowledge and technological improvements, it is promising in the field of medicine, especially as a resource that patients and clinicians can apply to. Objective: The aim of our study was to measure the accuracy and consistency of ChatGPT's answers to questions on rhinology. Methods: In March 2024, ChatGPT (version 4) was presented with 130 questions on rhinology. Each question was asked twice and consistency/reproducibility was investigated. The answers were evaluated by three ENT physicians. The physicians followed a standardised 4-point format (1: completely correct, 2: partially correct, 3: a mix of accurate and inaccurate/misleading, 4: completely incorrect/irrelevant). Results: ChatGPT gave consistent answers at a rate of 91.5% (119/130). Among the inconsistent answers, the second answer was found to be more correct in 10/11 cases, a statistically significant difference (p: 0.011). In the three controllers' evaluations of the questions, the number of completely correct answers was 99/81/80 (76.2%/62.3%/61.5%), respectively, while the number of completely incorrect answers was 7/6/7 (5.4%/4.6%/5.4%). Accordingly, it was seen that there was no statistical difference between the controllers (p: 0.270). Conclusion: ChatGPT's level of inaccuracy in the patient information and education process can be considered acceptable, and it can be regarded as reliable. However, its answers are not free of errors, and it can give misleading answers to some questions. We believe it would be safer to use it as informative and educational material under the control of experts.
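
The reported consistency rate (119/130 = 91.5%) can be reproduced with a simple tally, under the assumption that "consistent" means the two repeated answers to a question received the same rating; the study's own criterion may differ, and this sketch is not the authors' code:

    # first_ratings and second_ratings hold the 1-4 ratings given to the first and second
    # answer of each of the 130 questions (hypothetical data layout).
    def consistency_rate(first_ratings, second_ratings):
        pairs = list(zip(first_ratings, second_ratings))
        consistent = sum(1 for a, b in pairs if a == b)
        return consistent / len(pairs)

    # With 119 matching pairs out of 130 questions this returns 119 / 130 = 0.915 (91.5%).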

Language: English

Cited by

0

Examination of the Quality and Readability of Chatbot Responses to Patient Questions: A Synthesis of Recent Studies (Preprint) DOI
Peter Whittaker,

Mengyan Sun

Published: Feb. 4, 2025

BACKGROUND Patient use of chatbots to obtain medical information has been anticipated with both optimism and pessimism. The simplicity of asking questions and receiving immediate answers has prompted investigators to examine the quality and readability of chatbot responses. We sought to review the current results at this nascent stage in chatbot development. OBJECTIVE To evaluate the quality and readability data reported to date. METHODS We searched multiple databases to identify studies that evaluated chatbot response quality using the DISCERN instrument, which is designed to assess written material intended for patients. From these studies, we extracted DISCERN scores, the number of words used in the questions, the number of questions asked, the number of evaluators, and, where recorded, readability; we also examined a measure of the rank of the journals in which the studies were published. We combined these parameters in a linear regression model to determine potential associations with quality. RESULTS We identified 32 studies that conducted 57 tests of chatbots. The average number of words in the prompts ranged from 6 to 41, and the number of questions asked from 3 to 119. As the number of questions increased, quality decreased. Forty-two percent of tests produced responses ranked as “good” or higher, and only one test was below college-level readability. An increased quality score was associated with the number of prompt words in simple regression. In the combined model, higher quality scores were associated with the use of three or more evaluators and inversely with journal rank, but not with the number of words. CONCLUSIONS The variable and often poor quality of chatbot responses to patient questions reinforces pessimism about their role. However, the principles of prompt engineering (the art of asking questions) have yet to be rigorously applied. Therefore, we remain optimistic that responses will improve.
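
A minimal sketch of the kind of linear regression described above, using hypothetical per-test values and illustrative column names (this is not the authors' model specification or data):

    import pandas as pd
    import statsmodels.formula.api as smf

    # One row per chatbot test; all values are hypothetical placeholders.
    tests = pd.DataFrame({
        "discern_score": [45, 52, 38, 60, 41, 55, 48, 36],
        "prompt_words":  [12, 30,  8, 22, 15, 41,  6, 19],
        "n_questions":   [20, 50, 10,  3, 119, 25, 60, 40],
        "n_evaluators":  [ 2,  3,  1,  4,   2,  3,  2,  1],
        "journal_rank":  [ 3,  1,  5,  2,   4,  1,  3,  5],
    })

    # Ordinary least squares: response quality as a function of the extracted study parameters.
    model = smf.ols(
        "discern_score ~ prompt_words + n_questions + n_evaluators + journal_rank",
        data=tests,
    ).fit()
    print(model.params)  # coefficient signs indicate the direction of each association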

Language: English

Cited by

0

Leveraging artificial intelligence chatbots for anemia prevention: A comparative study of ChatGPT-3.5, copilot, and Gemini outputs against Google Search results DOI Creative Commons
Shinya Ito, Emi Furukawa, Tsuyoshi Okuhara

et al.

PEC Innovation, Journal year: 2025, Issue: unknown, pp. 100390 - 100390

Published: April 1, 2025

Language: English

Cited by

0
