The Potential of ChatGPT as a Source of Information for Kidney Transplant Recipients and Their Caregivers DOI Open Access
Kaan Can Demirbaş, Seha Saygılı, Esra Karabağ Yılmaz, et al.

Pediatric Transplantation, Journal Year: 2025, Volume and Issue: 29(3)

Published: March 13, 2025

ABSTRACT Background: Education and enhancing the knowledge of adolescents who will undergo kidney transplantation are among the primary objectives of their care. While there are specific interventions in place to achieve this, they require extensive resources. The rise of large language models like ChatGPT‐3.5 offers potential assistance for providing information to patients. This study aimed to evaluate the accuracy, relevance, and safety of ChatGPT‐3.5's responses to patient‐centered questions about pediatric kidney transplantation. The objective was to assess whether ChatGPT‐3.5 could be a supplementary educational tool for adolescents and their caregivers in a complex medical context. Methods: A total of 37 questions were presented to ChatGPT‐3.5, which was prompted to respond as a health professional would to a layperson. Five nephrologists independently evaluated the outputs for accuracy, relevancy, comprehensiveness, understandability, readability, and safety. Results: The mean accuracy, relevancy, and comprehensiveness scores for all outputs were 4.51, 4.56, and 4.55, respectively. Out of 37 outputs, four were rated completely accurate, and seven were rated completely relevant and comprehensive. Only one output had an accuracy score below 4. Twelve outputs were considered potentially risky, but only three had a risk grade of moderate or higher. Outputs that were considered risky had accuracy and relevancy scores below the average. Conclusion: Our findings suggest that ChatGPT could be a useful tool for individuals waiting for kidney transplantation. However, the presence of potentially risky outputs underscores the necessity of human oversight and validation.

Language: English

The application of large language models in medicine: A scoping review DOI Creative Commons
Xiangbin Meng, Xiangyu Yan, Kuo Zhang, et al.

iScience, Journal Year: 2024, Volume and Issue: 27(5), P. 109713 - 109713

Published: April 23, 2024

This study systematically reviewed the application of large language models (LLMs) in medicine, analyzing 550 selected studies from a vast literature search. LLMs like ChatGPT transformed healthcare by enhancing diagnostics, medical writing, education, and project management. They assisted in drafting medical documents, creating training simulations, and streamlining research processes. Despite their growing utility in diagnosis and improving doctor-patient communication, challenges persisted, including limitations in contextual understanding and the risk of over-reliance. The surge in LLM-related research indicated a focus on medical writing, diagnostics, and patient communication, but highlighted the need for careful integration, considering validation, ethical concerns, and the balance with traditional medical practice. Future research directions suggested a focus on multimodal LLMs, deeper algorithmic understanding, and ensuring responsible, effective use in healthcare.

Language: English

Citations: 62

Assessment of a Large Language Model’s Responses to Questions and Cases About Glaucoma and Retina Management DOI Creative Commons
Andy Huang, Kyle Hirabayashi, Laura Barna, et al.

JAMA Ophthalmology, Journal Year: 2024, Volume and Issue: 142(4), P. 371 - 371

Published: Feb. 22, 2024

Large language models (LLMs) are revolutionizing medical diagnosis and treatment, offering unprecedented accuracy and ease surpassing conventional search engines. Their integration into medical assistance programs will become pivotal for ophthalmologists as an adjunct for practicing evidence-based medicine. Therefore, the diagnostic and treatment accuracy of LLM-generated responses compared with fellowship-trained ophthalmologists can help assess their accuracy and validate their potential utility in ophthalmic subspecialties.

Language: English

Citations: 49

Accuracy of an Artificial Intelligence Chatbot’s Interpretation of Clinical Ophthalmic Images DOI
Andrew Mihalache, Ryan S. Huang, Marko Popović, et al.

JAMA Ophthalmology, Journal Year: 2024, Volume and Issue: 142(4), P. 321 - 321

Published: Feb. 29, 2024

Ophthalmology is reliant on effective interpretation of multimodal imaging to ensure diagnostic accuracy. The new ability of ChatGPT-4 (OpenAI) to interpret ophthalmic images has not yet been explored.

Language: English

Citations: 45

Utility of artificial intelligence‐based large language models in ophthalmic care DOI Creative Commons
Sayantan Biswas, Leon N. Davies, Amy L. Sheppard, et al.

Ophthalmic and Physiological Optics, Journal Year: 2024, Volume and Issue: 44(3), P. 641 - 671

Published: Feb. 25, 2024

With the introduction of ChatGPT, artificial intelligence (AI)-based large language models (LLMs) are rapidly becoming popular within the scientific community. They use natural language processing to generate human-like responses to queries. However, the application of LLMs and the comparison of abilities among different LLMs with their human counterparts in ophthalmic care remain under-reported.

Language: English

Citations: 31

Dr. Google vs. Dr. ChatGPT: Exploring the Use of Artificial Intelligence in Ophthalmology by Comparing the Accuracy, Safety, and Readability of Responses to Frequently Asked Patient Questions Regarding Cataracts and Cataract Surgery DOI
Samuel Cohen, Arthur Brant, Ann Fisher, et al.

Seminars in Ophthalmology, Journal Year: 2024, Volume and Issue: 39(6), P. 472 - 479

Published: March 22, 2024

Purpose: Patients are using online search modalities to learn about their eye health. While Google remains the most popular search engine, the use of large language models (LLMs) like ChatGPT has increased. Cataract surgery is the most common surgical procedure in the US, and there is limited data on the quality of online information that populates after searches related to cataract surgery on search engines such as Google and LLM platforms such as ChatGPT. We identified the most common patient frequently asked questions (FAQs) about cataracts and cataract surgery and evaluated the accuracy, safety, and readability of the answers to these questions provided by both Google and ChatGPT. We demonstrated the utility of ChatGPT in writing notes and creating patient education materials.

Language: English

Citations: 23

Systematic Review of Large Language Models for Patient Care: Current Applications and Challenges DOI Creative Commons
Felix Busch, Lena Hoffmann, Christopher Rueger, et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: March 5, 2024

Abstract The introduction of large language models (LLMs) into clinical practice promises to improve patient education and empowerment, thereby personalizing medical care and broadening access to medical knowledge. Despite the popularity of LLMs, there is a significant gap in systematized information on their use in patient care. Therefore, this systematic review aims to synthesize current applications and limitations of LLMs in patient care using a data-driven convergent synthesis approach. We searched 5 databases for qualitative, quantitative, and mixed methods articles on LLMs in patient care published between 2022 and 2023. From 4,349 initial records, 89 studies across 29 medical specialties were included, primarily examining models based on the GPT-3.5 (53.2%, n=66 of 124 different LLMs examined per study) and GPT-4 (26.6%, n=33/124) architectures in medical question answering, followed by patient information generation, including medical text summarization or translation, and clinical documentation. Our analysis delineates two primary domains of LLM limitations: design and output. Design limitations included 6 second-order and 12 third-order codes, such as lack of medical domain optimization, data transparency, and accessibility issues, while output limitations included 9 second-order and 32 third-order codes, for example, non-reproducibility, non-comprehensiveness, incorrectness, unsafety, and bias. In conclusion, this study is the first review to systematically map LLM applications and limitations in patient care, providing a foundational framework and taxonomy for their implementation and evaluation in healthcare settings.

Language: English

Citations: 17

A framework for human evaluation of large language models in healthcare derived from literature review DOI Creative Commons

Thomas Yu Chow Tam, Sonish Sivarajkumar, Sumit Kapoor, et al.

npj Digital Medicine, Journal Year: 2024, Volume and Issue: 7(1)

Published: Sept. 28, 2024

Abstract With generative artificial intelligence (GenAI), particularly large language models (LLMs), continuing to make inroads in healthcare, assessing LLMs with human evaluations is essential to assuring safety and effectiveness. This study reviews existing literature on human evaluation methodologies for LLMs in healthcare across various medical specialties and addresses factors such as evaluation dimensions, sample types and sizes, selection and recruitment of evaluators, frameworks and metrics, evaluation process, and statistical analysis type. Our literature review of 142 studies shows gaps in the reliability, generalizability, and applicability of current human evaluation practices. To overcome such significant obstacles to healthcare LLM developments and deployments, we propose QUEST, a comprehensive and practical framework for human evaluation of LLMs covering three phases of workflow: Planning, Implementation and Adjudication, and Scoring and Review. QUEST is designed with five proposed evaluation principles: Quality of Information, Understanding and Reasoning, Expression Style and Persona, Safety and Harm, and Trust and Confidence.

Language: English

Citations: 17

Current applications and challenges in large language models for patient care: a systematic review DOI Creative Commons
Felix Busch, Lena Hoffmann, Christopher Rueger, et al.

Communications Medicine, Journal Year: 2025, Volume and Issue: 5(1)

Published: Jan. 21, 2025

Abstract Background: The introduction of large language models (LLMs) into clinical practice promises to improve patient education and empowerment, thereby personalizing medical care and broadening access to medical knowledge. Despite the popularity of LLMs, there is a significant gap in systematized information on their use in patient care. Therefore, this systematic review aims to synthesize current applications and limitations of LLMs in patient care. Methods: We systematically searched 5 databases for qualitative, quantitative, and mixed methods articles on LLMs in patient care published between 2022 and 2023. From 4349 initial records, 89 studies across 29 medical specialties were included. Quality assessment was performed using the Mixed Methods Appraisal Tool 2018. A data-driven convergent synthesis approach was applied for thematic syntheses of LLM applications and limitations using free line-by-line coding in Dedoose. Results: We show that most studies investigate Generative Pre-trained Transformers (GPT)-3.5 (53.2%, n = 66 of 124 different LLMs examined) and GPT-4 (26.6%, n = 33/124) in answering medical questions, followed by patient information generation, including medical text summarization or translation, and clinical documentation. Our analysis delineates two primary domains of LLM limitations: design and output. Design limitations include 6 second-order and 12 third-order codes, such as lack of medical domain optimization, data transparency, and accessibility issues, while output limitations include 9 second-order and 32 third-order codes, for example, non-reproducibility, non-comprehensiveness, incorrectness, unsafety, and bias. Conclusions: This review systematically maps LLM applications and limitations in patient care, providing a foundational framework and taxonomy for their implementation and evaluation in healthcare settings.

Language: English

Citations: 6

Large Language Models for Chatbot Health Advice Studies DOI Creative Commons
Bright Huo, Amy Boyle, Nana Marfo, et al.

JAMA Network Open, Journal Year: 2025, Volume and Issue: 8(2), P. e2457879 - e2457879

Published: Feb. 4, 2025

Importance: There is much interest in the clinical integration of large language models (LLMs) in health care. Many studies have assessed the ability of LLMs to provide health advice, but the quality of their reporting is uncertain. Objective: To perform a systematic review to examine the reporting variability among peer-reviewed studies evaluating the performance of generative artificial intelligence (AI)–driven chatbots for summarizing evidence and providing health advice, to inform the development of the Chatbot Assessment Reporting Tool (CHART). Evidence Review: A search of MEDLINE via Ovid, Embase via Elsevier, and Web of Science from inception to October 27, 2023, was conducted with the help of a health sciences librarian to yield 7752 articles. Two reviewers screened articles by title and abstract followed by full-text review to identify primary studies evaluating the clinical accuracy of generative AI-driven chatbots in providing health advice (chatbot health advice studies). Two reviewers then performed data extraction for 137 eligible studies. Findings: A total of 137 studies were included. Studies examined topics in surgery (55 [40.1%]), medicine (51 [37.2%]), and primary care (13 [9.5%]). Many studies focused on treatment (91 [66.4%]), diagnosis (60 [43.8%]), or disease prevention (29 [21.2%]). Most studies (136 [99.3%]) evaluated inaccessible, closed-source LLMs and did not provide enough information to identify the version of the LLM under evaluation. All studies lacked a sufficient description of LLM characteristics, including temperature, token length, fine-tuning availability, layers, and other details. Most studies did not describe a prompt engineering phase in their study. The date of LLM querying was reported in 54 (39.4%) studies. Most studies (89 [65.0%]) used subjective means to define the successful performance of the chatbot, while less than one-third addressed the ethical, regulatory, and patient safety implications of the clinical integration of LLMs. Conclusions and Relevance: In this systematic review of 137 chatbot health advice studies, the reporting quality was heterogeneous and may inform the development of the CHART reporting standards. Ethical, regulatory, and patient safety considerations are crucial as interest in the clinical integration of LLMs grows.

Language: English

Citations: 5

Evaluating Artificial Intelligence in Spinal Cord Injury Management: A Comparative Analysis of ChatGPT-4o and Google Gemini Against American College of Surgeons Best Practices Guidelines for Spine Injury DOI Creative Commons
Alexander Yu, Anna Fen‐Yau Li, Wasil Ahmed, et al.

Global Spine Journal, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 17, 2025

Study Design: Comparative Analysis. Objectives: The American College of Surgeons developed the 2022 Best Practice Guidelines to provide evidence-based recommendations for managing spinal injuries. This study aims to assess the concordance of ChatGPT-4o and Gemini Advanced with the 2022 ACS Best Practice Guidelines, offering the first expert evaluation of these models in managing spinal cord injuries. Methods: The 2022 ACS Trauma Quality Program Best Practices Guidelines for Spine Injury were used to create 52 questions based on key clinical recommendations. These were grouped into informational (8), diagnostic (14), and treatment (30) categories and posed to ChatGPT-4o and Google Gemini Advanced. Responses were graded for concordance with the guidelines and validated by a board-certified spine surgeon. Results: ChatGPT was concordant with the guidelines on 38 questions (73.07%) and Gemini on 36 (69.23%). Most non-concordant answers were due to insufficient information. The models disagreed on 8 questions, with ChatGPT concordant on 5 and Gemini on 3. Both achieved 75% concordance on informational questions; Gemini outperformed in diagnostics (78.57% vs 71.43%), while ChatGPT had higher concordance in treatment (73.33% vs 63.33%). Conclusions: ChatGPT-4o and Gemini Advanced demonstrate potential as valuable assets in spinal injury management by providing responses aligned with current best practices. The marginal differences in concordance rates suggest that neither model exhibits a superior ability to deliver recommendations concordant with the guidelines. Despite LLMs' increasing sophistication and utility, existing limitations currently prevent them from being clinically safe and practical in trauma-based settings.

Language: English

Citations: 2