Evaluating the effectiveness of large language models in patient education for conjunctivitis DOI
Jingyuan Wang, Runhan Shi, Qihua Le

et al.

British Journal of Ophthalmology, Journal Year: 2024, Volume and Issue: unknown, P. bjo - 325599

Published: Aug. 30, 2024

To evaluate the quality of responses from large language models (LLMs) to patient-generated conjunctivitis questions.

Language: English

Evaluating the Reliability of ChatGPT for Health-Related Questions: A Systematic Review DOI Creative Commons
Mohammad Beheshti, Imad Eddine Toubal, Khuder Alaboud

et al.

Informatics, Journal Year: 2025, Volume and Issue: 12(1), P. 9 - 9

Published: Jan. 17, 2025

The rapid advancement of large language models like ChatGPT has significantly impacted natural language processing, expanding its applications across various fields, including healthcare. However, there remains a significant gap in understanding the consistency and reliability of ChatGPT's performance across different medical domains. We conducted this systematic review according to an LLM-assisted PRISMA setup. The high-recall search term “ChatGPT” yielded 1101 articles from 2023 onwards. Through a dual-phase screening process, initially automated and subsequently performed manually by human reviewers, 128 studies were included. The studies covered a range of medical specialties, focusing on diagnosis, disease management, and patient education. The assessment metrics varied, but most studies compared ChatGPT's accuracy against evaluations by clinicians or reliable references. In several areas, ChatGPT demonstrated high accuracy, underscoring its effectiveness; however, some contexts revealed lower accuracy. The mixed outcomes across domains emphasize both the challenges and the opportunities of integrating AI into healthcare. The high accuracy in certain areas suggests substantial utility, yet the inconsistent performance across all domains indicates the need for ongoing evaluation and refinement. This review highlights ChatGPT's potential to improve healthcare delivery alongside the necessity of continued research to ensure its reliability.
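The dual-phase screening the review describes (an automated first pass followed by manual review) can be approximated with a short script. The sketch below is an assumption-laden illustration, not the review's actual protocol: the model name, prompt wording, and use of the OpenAI Python client are placeholders chosen for the example.

```python
# Minimal sketch (assumptions throughout) of an automated first screening
# pass: asking an LLM whether a title/abstract should be kept for full review.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def screen_record(title: str, abstract: str) -> bool:
    """Return True if the LLM judges the record relevant to ChatGPT in healthcare."""
    prompt = (
        "You are screening articles for a systematic review on the reliability "
        "of ChatGPT for health-related questions. Answer only INCLUDE or EXCLUDE.\n\n"
        f"Title: {title}\nAbstract: {abstract}"
    )
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name, not the review's choice
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content.strip().upper().startswith("INCLUDE")

# Records flagged True would still pass to the second, manual screening phase.
```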

Language: English

Citations

1

The large language model diagnoses tuberculous pleural effusion in pleural effusion patients through clinical feature landscapes DOI Creative Commons

Chaoling Wu, Wanyi Liu, Peng-jin Mei

et al.

Respiratory Research, Journal Year: 2025, Volume and Issue: 26(1)

Published: Feb. 12, 2025

Tuberculous pleural effusion (TPE) is a challenging extrapulmonary manifestation of tuberculosis, with traditional diagnostic methods often involving invasive surgery and being time-consuming. While various machine learning and statistical models have been proposed for TPE diagnosis, these are typically limited by complexities in data processing and difficulties in feature integration. Therefore, this study aims to develop a diagnostic model using ChatGPT-4, a large language model (LLM), and to compare its performance with logistic regression and machine learning models. By highlighting the advantages of LLMs in handling complex clinical data, identifying interrelationships between features, and improving diagnostic accuracy, this study seeks to provide a more efficient and precise solution for the early diagnosis of TPE. We conducted a cross-sectional study, collecting data from 109 TPE and 54 non-TPE patients for analysis and selecting 73 features from over 600 initial variables. The LLM-based model was compared with logistic regression and machine learning models (k-Nearest Neighbors, Random Forest, Support Vector Machines) on metrics such as the area under the curve (AUC), F1 score, sensitivity, and specificity. The LLM showed performance comparable to these models, outperforming them in specificity and overall accuracy. Key features such as adenosine deaminase (ADA) levels and monocyte percentage were effectively integrated into the model. We also developed a Python package ( https://pypi.org/project/tpeai/ ) for rapid diagnosis based on clinical data. The LLM-based model offers a non-surgical, accurate, and cost-effective method for TPE diagnosis. It provides a user-friendly tool for clinicians, with potential for broader use. Further validation on larger datasets is needed to optimize its application.
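As a rough illustration of the model comparison the abstract describes, the following sketch fits the named baseline classifiers (logistic regression, k-Nearest Neighbors, Random Forest, Support Vector Machines) and reports AUC, F1 score, sensitivity, and specificity. It uses synthetic data shaped like the 109 TPE / 54 non-TPE cohort; it is not the authors' tpeai package, and the two features are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): comparing baseline classifiers for
# TPE vs. non-TPE on the metrics named in the abstract, using synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, f1_score, recall_score

rng = np.random.default_rng(0)
# Synthetic stand-in for the 109 TPE / 54 non-TPE cohort with two features,
# e.g. [ADA level, monocyte %] (feature choice is an assumption).
X = rng.normal(size=(163, 2))
y = np.array([1] * 109 + [0] * 54)     # 1 = TPE, 0 = non-TPE
X[y == 1] += 1.0                       # make the classes separable

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "logistic_regression": LogisticRegression(),
    "knn": KNeighborsClassifier(),
    "random_forest": RandomForestClassifier(random_state=0),
    "svm": SVC(probability=True, random_state=0),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    prob = model.predict_proba(X_te)[:, 1]
    pred = (prob >= 0.5).astype(int)
    print(
        f"{name}: AUC={roc_auc_score(y_te, prob):.2f} "
        f"F1={f1_score(y_te, pred):.2f} "
        f"sensitivity={recall_score(y_te, pred):.2f} "
        f"specificity={recall_score(y_te, pred, pos_label=0):.2f}"
    )
```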

Language: English

Citations

1

Assessing the reliability of ChatGPT: a content analysis of self-generated and self-answered questions on clear aligners, TADs and digital imaging DOI Creative Commons
Orlando Motohiro Tanaka, Gil Guilherme Gasparello, Giovani Ceron Hartmann

et al.

Dental Press Journal of Orthodontics, Journal Year: 2023, Volume and Issue: 28(5)

Published: Jan. 1, 2023

ABSTRACT Introduction: Artificial Intelligence (AI) is a tool that is already part of our reality, and this is an opportunity to understand how it can be useful in interacting with patients and providing valuable information about orthodontics. Objective: This study evaluated the ability of ChatGPT to provide accurate and quality answers to questions on clear aligners, temporary anchorage devices and digital imaging. Methods: Forty-five answers were generated by ChatGPT 4.0 and analyzed separately by five orthodontists. The evaluators independently rated the answers provided on a Likert scale, in which higher scores indicated greater accuracy (1 = very poor; 2 = poor; 3 = acceptable; 4 = good; 5 = very good). The Kruskal-Wallis H test (p < 0.05) and post-hoc pairwise comparisons with Bonferroni correction were performed. Results: Of the 225 evaluations from the five different evaluators, 11 (4.9%) were considered very poor, 4 (1.8%) poor, and 15 (6.7%) acceptable. The majority were considered good [34 (15.1%)] and very good [161 (71.6%)]. Regarding the evaluators' scores, only slight agreement was perceived, with Fleiss's Kappa equal to 0.004. Conclusions: ChatGPT has proven effective in providing answers related to clear aligners, temporary anchorage devices, and digital imaging within the context of interest.
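For readers unfamiliar with the two statistics reported above, the snippet below shows how a Kruskal-Wallis H test across evaluators and Fleiss's kappa for inter-rater agreement are typically computed. The rating matrix is synthetic, not the study's data.

```python
# Minimal sketch of the statistics named in the abstract: Kruskal-Wallis H
# across evaluators and Fleiss' kappa for inter-rater agreement.
import numpy as np
from scipy.stats import kruskal
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(1)
# 45 answers rated 1-5 by 5 orthodontists (rows = answers, cols = evaluators).
ratings = rng.integers(1, 6, size=(45, 5))   # synthetic, for illustration only

# Kruskal-Wallis H test: do the five evaluators' score distributions differ?
h_stat, p_value = kruskal(*[ratings[:, j] for j in range(ratings.shape[1])])
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_value:.3f}")

# Fleiss' kappa: chance-corrected agreement among the five evaluators.
counts, _ = aggregate_raters(ratings)        # answers x categories count table
print(f"Fleiss's kappa = {fleiss_kappa(counts):.3f}")
```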

Language: English

Citations

20

Enhancing Health Literacy: Evaluating the Readability of Patient Handouts Revised by ChatGPT's Large Language Model DOI
Austin R. Swisher, Arthur W. Wu, Godfrey K. F. Liu

et al.

Otolaryngology, Journal Year: 2024, Volume and Issue: 171(6), P. 1751 - 1757

Published: Aug. 6, 2024

To use an artificial intelligence (AI)-powered large language model (LLM) to improve the readability of patient handouts.

Language: English

Citations

8

Evaluating Chat Generative Pre-trained Transformer Responses to Common Pediatric In-toeing Questions DOI
Jason Zarahi Amaral, Rebecca J. Schultz, Benjamin M. Martin

et al.

Journal of Pediatric Orthopaedics, Journal Year: 2024, Volume and Issue: 44(7), P. e592 - e597

Published: April 30, 2024

Objective: Chat generative pre-trained transformer (ChatGPT) has garnered attention in health care for its potential to reshape patient interactions. As patients increasingly rely on artificial intelligence platforms, concerns about information accuracy arise. In-toeing, a common lower extremity variation, often leads to pediatric orthopaedic referrals despite observation being the primary treatment. Our study aims to assess ChatGPT's responses to common in-toeing questions, contributing to discussions on innovation and technology in patient education. Methods: We compiled a list of 34 questions from the “Frequently Asked Questions” sections of 9 health care–affiliated websites, identifying the 25 most commonly encountered. On January 17, 2024, we queried ChatGPT 3.5 in separate sessions and recorded the responses. These questions were posed again on January 21 to assess reproducibility. Two surgeons evaluated the responses using a scale ranging from “excellent (no clarification)” to “unsatisfactory (substantial clarification).” Average ratings were used when the evaluators' grades were within one level of each other. In discordant cases, the senior author provided the decisive rating. Results: We found 46% of responses were “excellent” and 44% were “satisfactory (minimal clarification).” In addition, 8% of cases were “satisfactory (moderate clarification)” and 2% were “unsatisfactory.” Questions had appropriate readability, with an average Flesch-Kincaid Grade Level of 4.9 (±2.1). However, responses were written at a collegiate level, averaging 12.7 (±1.4). No significant differences were observed between question topics. Furthermore, ChatGPT exhibited moderate consistency after repeated queries, evidenced by a Spearman rho coefficient of 0.55 (P = 0.005). The chatbot appropriately described in-toeing as normal or spontaneously resolving in 62% of cases and consistently recommended evaluation by a provider in 100%. Conclusion: ChatGPT presented a serviceable, though not perfect, representation of the diagnosis and management of in-toeing while demonstrating moderate reproducibility; its utility could be enhanced by improving readability and incorporating evidence-based guidelines. Evidence: IV—diagnostic.
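The two quantitative measures in this abstract, Flesch-Kincaid Grade Level and Spearman's rho for repeat-query consistency, can be reproduced with standard libraries. The sketch below uses the third-party textstat package and invented example text and ratings; it illustrates the method only, not the study's data or code.

```python
# Minimal sketch of the two measures in the abstract: Flesch-Kincaid Grade
# Level for question/response text, and Spearman's rho between ratings of
# repeated queries. Example texts and scores are invented.
import textstat                      # third-party readability library
from scipy.stats import spearmanr

question = "Why does my child walk with their toes pointed in?"
response = (
    "In-toeing is usually a normal variation of lower-extremity rotation "
    "that resolves spontaneously; persistent or painful cases warrant "
    "evaluation by a pediatric provider."
)

print("Question grade level:", textstat.flesch_kincaid_grade(question))
print("Response grade level:", textstat.flesch_kincaid_grade(response))

# Reproducibility check: ratings of the same 25 questions asked on two dates.
first_pass  = [4, 4, 3, 5, 4, 3, 4, 5, 3, 4, 4, 3, 5, 4, 4, 3, 4, 5, 4, 3, 4, 4, 3, 5, 4]
second_pass = [4, 3, 3, 5, 4, 4, 4, 5, 3, 4, 3, 3, 5, 4, 4, 3, 4, 4, 4, 3, 4, 4, 3, 5, 4]
rho, p = spearmanr(first_pass, second_pass)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```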

Language: English

Citations

7

Understanding natural language: Potential application of large language models to ophthalmology DOI Creative Commons
Zefeng Yang, Biao Wang, Fengqi Zhou

et al.

Asia-Pacific Journal of Ophthalmology, Journal Year: 2024, Volume and Issue: 13(4), P. 100085 - 100085

Published: July 1, 2024

Large language models (LLMs), a natural language processing technology based on deep learning, are currently in the spotlight. These models closely mimic human language comprehension and generation. Their evolution has undergone several waves of innovation, similar to convolutional neural networks. The transformer architecture and the advancement of generative artificial intelligence mark a monumental leap beyond early-stage pattern recognition via supervised learning. With the expansion of parameters and training data (terabytes), LLMs unveil remarkable human interactivity, encompassing capabilities such as memory retention and comprehension. These advances make LLMs particularly well-suited for roles in healthcare communication between medical practitioners and patients. In this comprehensive review, we discuss the developmental trajectory of LLMs and their potential implications for clinicians and patients. For clinicians, LLMs can be used for automated documentation and, given better inputs and extensive validation, may eventually be able to autonomously diagnose and treat patients. For patient care, LLMs can provide triage suggestions, summarization of medical documents, explanation of a patient's condition, and customized patient education materials tailored to the patient's comprehension level. The limitations of LLMs and possible solutions for real-world use are also presented. Given the rapid advancements in this area, this review attempts to briefly cover the many roles that LLMs may play in the ophthalmic space, with a focus on improving the quality of healthcare delivery.

Language: English

Citations

7

Rare disease diagnosis using knowledge guided retrieval augmentation for ChatGPT DOI

Charlotte Zelin, Wendy K. Chung, Médéric Jeanne

et al.

Journal of Biomedical Informatics, Journal Year: 2024, Volume and Issue: 157, P. 104702 - 104702

Published: July 29, 2024

Language: English

Citations

7

Comprehensiveness of Large Language Models in Patient Queries on Gingival and Endodontic Health DOI Creative Commons
Qian Zhang, Zhengyu Wu, Jinlin Song

et al.

International Dental Journal, Journal Year: 2024, Volume and Issue: unknown

Published: Aug. 1, 2024

Given the increasing interest in using large language models (LLMs) for self-diagnosis, this study aimed to evaluate the comprehensiveness of two prominent LLMs, ChatGPT-3.5 and ChatGPT-4, in addressing common queries related to gingival and endodontic health across different query contexts and types.

Language: English

Citations

7

Development of a novel scoring system for glaucoma risk based on demographic and laboratory factors using ChatGPT-4 DOI
Joon Yul Choi, Tae Keun Yoo

Medical & Biological Engineering & Computing, Journal Year: 2024, Volume and Issue: unknown

Published: Aug. 12, 2024

Language: English

Citations

6

Review of emerging trends and projection of future developments in large language models research in ophthalmology DOI
Matthew Wong, Zhi Wei Lim, Krithi Pushpanathan

et al.

British Journal of Ophthalmology, Journal Year: 2023, Volume and Issue: 108(10), P. 1362 - 1370

Published: Dec. 11, 2023

Large language models (LLMs) are fast emerging as potent tools in healthcare, including ophthalmology. This systematic review offers a twofold contribution: it summarises current trends in ophthalmology-related LLM research and projects future directions for this burgeoning field.

Language: English

Citations

15