Published: April 24, 2025
Language: English
The Lancet Digital Health, Journal Year: 2024, Volume and Issue: 6(9), P. e662 - e672
Published: Aug. 23, 2024
Amid the rapid integration of artificial intelligence into clinical settings, large language models (LLMs), such as Generative Pre-trained Transformer-4, have emerged as multifaceted tools with potential for health-care delivery, diagnosis, and patient care. However, the deployment of LLMs raises substantial regulatory and safety concerns. Due to their high output variability, poor inherent explainability, and the risk of so-called AI hallucinations, LLM-based applications that serve a medical purpose face challenges for approval as medical devices under US and EU laws, including the recently passed EU Artificial Intelligence Act. Despite unaddressed risks for patients, including misdiagnosis and unverified medical advice, such applications are available on the market. The regulatory ambiguity surrounding these tools creates an urgent need for frameworks that accommodate their unique capabilities and limitations. Alongside the development of such frameworks, existing regulations should be enforced. If regulators fear enforcing the rules in a market dominated by large suppliers or technology companies, the consequences of layperson harm will force belated action, damaging the potential of LLM-based medical advice.
Language: English
Citations: 26
Nature Medicine, Journal Year: 2025, Volume and Issue: unknown
Published: Jan. 8, 2025
Language: English
Citations: 13
Nature Medicine, Journal Year: 2025, Volume and Issue: unknown
Published: Feb. 5, 2025
Language: English
Citations: 6
BMJ Quality & Safety, Journal Year: 2025, Volume and Issue: unknown, P. bmjqs - 017918
Published: Jan. 3, 2025
Generative artificial intelligence (AI) technologies have the potential to revolutionise healthcare delivery but require classification and monitoring of patient safety risks. To address this need, we developed and evaluated a preliminary system for categorising generative AI errors. Our system is organised around two stages (input and output), with specific error types by stage. We applied our system to two generative AI applications to assess its effectiveness in identifying safety issues: patient-facing conversational large language models (LLMs) and an ambient digital scribe (ADS) for clinical documentation. In the LLM analysis, we identified 45 errors across 27 medical queries, with omission being the most common error type (42% of errors). Of these errors, 50% were categorised as low significance, 25% as moderate significance, and 25% as high significance. Similarly, in the ADS simulation, we identified 66 errors across 11 patient visits, with omission again the most common error type (83% of errors); 55% were categorised as low significance and 45% as moderate significance. These findings demonstrate the system's utility for categorising output from different applications, providing a starting point for developing a more robust process to better understand the safety risks of AI-enabled healthcare technologies.
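As an illustration of the two-stage taxonomy described above, the following Python sketch shows one way such error categorisation could be encoded; the stage, error-type, and significance names are assumptions for illustration, not the authors' published schema.

```python
# Illustrative sketch of a two-stage generative AI error taxonomy.
# Stage/type/significance names are hypothetical placeholders.
from dataclasses import dataclass
from enum import Enum

class Stage(Enum):
    INPUT = "input"     # e.g., missing or mis-captured source information
    OUTPUT = "output"   # e.g., errors in the generated text itself

class ErrorType(Enum):
    OMISSION = "omission"        # most common type in both evaluations above
    FABRICATION = "fabrication"
    INACCURACY = "inaccuracy"

class Significance(Enum):
    LOW = 1
    MODERATE = 2
    HIGH = 3

@dataclass
class GenAIError:
    stage: Stage
    error_type: ErrorType
    significance: Significance
    note: str = ""

def error_type_shares(errors: list[GenAIError]) -> dict[str, float]:
    """Percentage share of each error type, mirroring the figures reported above."""
    counts: dict[str, int] = {}
    for e in errors:
        counts[e.error_type.value] = counts.get(e.error_type.value, 0) + 1
    return {k: round(100 * v / len(errors), 1) for k, v in counts.items()}

errors = [
    GenAIError(Stage.OUTPUT, ErrorType.OMISSION, Significance.LOW),
    GenAIError(Stage.OUTPUT, ErrorType.INACCURACY, Significance.MODERATE),
]
print(error_type_shares(errors))  # {'omission': 50.0, 'inaccuracy': 50.0}
```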
Language: English
Citations: 3
Cancer Medicine, Journal Year: 2025, Volume and Issue: 14(1)
Published: Jan. 1, 2025
ABSTRACT Purpose: Caregivers in pediatric oncology need accurate and understandable information about their child's condition, treatment, and side effects. This study assesses the performance of publicly accessible large language model (LLM)‐supported tools in providing valuable and reliable information to caregivers of children with cancer. Methods: In this cross‐sectional study, we evaluated four LLM‐supported tools—ChatGPT (GPT‐4), Google Bard (Gemini Pro), Microsoft Bing Chat, and Google SGE—against a set of frequently asked questions (FAQs) derived from the Children's Oncology Group Family Handbook and expert input (in total, 26 FAQs and 104 generated responses). Five experts assessed the LLM responses using measures including accuracy, clarity, inclusivity, completeness, clinical utility, and overall rating. Additionally, content quality was assessed for readability, AI disclosure, source credibility, resource matching, and originality. We used descriptive analysis and statistical tests including Shapiro–Wilk, Levene's, Kruskal–Wallis H‐tests, and Dunn's post hoc tests for pairwise comparisons. Results: ChatGPT showed high performance when rated by experts. Google Bard also performed well, especially in the accuracy and clarity of its responses, whereas Bing Chat and Google SGE had lower scores. Disclosure of responses being AI‐generated was observed less frequently in some tools, which may have affected perceived credibility, while others maintained a balance between AI disclosure and response clarity. Google SGE provided the most readable answers with the least complexity. Scores varied significantly (p < 0.001) across all evaluations except inclusivity. Through our thematic analysis of free‐text comments, emotional tone and empathy emerged as a unique theme, with mixed feedback on expectations for LLMs to be empathetic. Conclusion: LLM‐supported tools can enhance caregivers' knowledge of pediatric oncology. Each tool has strengths and areas for improvement, indicating the need for careful selection based on specific contexts. Further research is required to explore the application of these tools in other medical specialties and patient demographics, assessing their broader applicability and long‐term impacts.
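The statistical workflow named in the Methods (Shapiro–Wilk, Levene's, Kruskal–Wallis H, and Dunn's post hoc tests) can be sketched with SciPy and the scikit-posthocs package; the scores below are randomly generated placeholders, not study data.

```python
# Sketch of the normality check -> omnibus test -> post hoc pipeline.
# Placeholder ratings: 26 FAQs x 4 tools = 104 responses, as in the study design.
import numpy as np
import pandas as pd
from scipy import stats
import scikit_posthocs as sp  # pip install scikit-posthocs

rng = np.random.default_rng(0)
scores = pd.DataFrame({
    "tool": np.repeat(["ChatGPT", "Bard", "Bing", "SGE"], 26),
    "accuracy": np.concatenate([
        rng.normal(m, 0.5, 26) for m in (4.5, 4.3, 3.8, 3.6)
    ]),
})
groups = [g["accuracy"].to_numpy() for _, g in scores.groupby("tool")]

# Normality and equal-variance checks motivate the non-parametric omnibus test.
print([stats.shapiro(g).pvalue for g in groups])   # Shapiro-Wilk per tool
print(stats.levene(*groups).pvalue)                # Levene's test

# Kruskal-Wallis H across the four tools, then Dunn's test for pairwise
# comparisons with a multiplicity correction.
print(stats.kruskal(*groups))
print(sp.posthoc_dunn(scores, val_col="accuracy", group_col="tool",
                      p_adjust="bonferroni"))
```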
Language: English
Citations: 3
JAMA Network Open, Journal Year: 2024, Volume and Issue: 7(7), P. e2422399 - e2422399
Published: July 16, 2024
Importance: Virtual patient-physician communications have increased since 2020 and negatively impacted primary care physician (PCP) well-being. Generative artificial intelligence (GenAI) drafts of replies to patient messages could potentially reduce health care professional (HCP) workload and improve communication quality, but only if the drafts are considered useful. Objectives: To assess PCPs' perceptions of GenAI drafts and to examine linguistic characteristics associated with equity and perceived empathy. Design, Setting, and Participants: This cross-sectional quality improvement study tested the hypothesis that PCP ratings of GenAI drafts (created using the electronic health record [EHR] standard prompts) would be equivalent to HCP-generated responses on 3 dimensions. The study was conducted at NYU Langone Health internal medicine practices piloting GenAI for patient-HCP communication. Exposures: Randomly assigned patient messages coupled with either an HCP response or the GenAI draft response. Main Outcomes and Measures: PCPs rated the responses' information content quality (eg, relevance) and communication style quality (eg, verbosity) on a Likert scale, and whether they would use the draft or start anew (usable vs unusable). Branching logic further probed for empathy, personalization, and professionalism of responses. Computational linguistics methods assessed differences between HCP and GenAI responses, focusing on equity and perceived empathy. Results: A total of 16 PCPs (8 [50.0%] female) reviewed 344 messages (175 GenAI drafted; 169 HCP drafted). Both types of responses were rated favorably. GenAI responses were rated higher for communication style than HCP responses (mean [SD], 3.70 [1.15] vs 3.38 [1.20]; P = .01, U = 12 568.5) but similar to HCPs on information content (mean [SD], 3.53 [1.26] vs 3.41 [1.27]; P = .37; U = 13 981.0) and usable proportion (mean [SD], 0.69 [0.48] vs 0.65 [0.47]; P = .49, t = −0.6842). Usable GenAI responses were more often rated empathetic than usable HCP responses (32 of 86 [37.2%] vs 13 of 79 [16.5%]; difference, 125.5%), possibly attributable to more subjective (mean [SD], 0.54 [0.16] vs 0.31 [0.23]; P < .001; difference, 74.2%) and positive (mean [SD] polarity, 0.21 [0.14] vs 0.13 [0.25]; P = .02; difference, 61.5%) language; GenAI responses were also numerically longer (mean [SD] word count, 90.5 [32.0] vs 65.4 [62.6]; difference, 38.4%), although the difference was not statistically significant (P = .07), and more linguistically complex (mean [SD] score, 125.2 [47.8] vs 95.4 [58.8]; P = .002; difference, 31.2%). Conclusions: In this study of PCP perceptions of an EHR-integrated GenAI chatbot, GenAI was found to communicate information better and with more empathy than HCPs, highlighting its potential to enhance patient-HCP communication. However, GenAI responses were less readable than HCPs', a concern for patients with low health or English literacy.
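A minimal sketch of the kind of computational-linguistics comparison described above, using TextBlob for polarity and subjectivity and a Mann-Whitney U test for the group comparison; the example messages are invented, and the study's actual feature set and tooling may differ.

```python
# Compare simple linguistic features of HCP vs GenAI message drafts.
# Example texts are invented placeholders, not study data.
from textblob import TextBlob          # pip install textblob
from scipy.stats import mannwhitneyu

hcp_drafts = [
    "Your labs look fine. Follow up in 3 months.",
    "Please schedule an appointment to discuss the results.",
]
genai_drafts = [
    "Thank you for reaching out! I'm glad to say your labs look great.",
    "I understand this can feel worrying; happily, everything looks normal.",
]

def features(texts):
    rows = []
    for t in texts:
        sentiment = TextBlob(t).sentiment
        rows.append({
            "polarity": sentiment.polarity,          # -1 (negative) .. 1 (positive)
            "subjectivity": sentiment.subjectivity,  # 0 (objective) .. 1 (subjective)
            "word_count": len(t.split()),
        })
    return rows

hcp, genai = features(hcp_drafts), features(genai_drafts)
# With real samples (n >> 2 per arm), repeat per feature
# (polarity, subjectivity, word count):
print(mannwhitneyu([r["polarity"] for r in hcp],
                   [r["polarity"] for r in genai]))
```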
Language: English
Citations: 15
NEJM AI, Journal Year: 2024, Volume and Issue: 1(8)
Published: July 10, 2024
Language: English
Citations: 14
medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: July 25, 2024
Large Language Models (LLMs) are rapidly being adopted in healthcare, necessitating standardized reporting guidelines. We present TRIPOD-LLM, an extension of the TRIPOD+AI statement, addressing the unique challenges of LLMs in biomedical applications. TRIPOD-LLM provides a comprehensive checklist of 19 main items and 50 subitems, covering key aspects from title to discussion. The guidelines introduce a modular format accommodating various LLM research designs and tasks, with 14 main items and 32 subitems applicable across all categories. Developed through an expedited Delphi process and expert consensus, TRIPOD-LLM emphasizes transparency, human oversight, and task-specific performance reporting. We also present an interactive website ( https://tripod-llm.vercel.app/ ) facilitating easy guideline completion and PDF generation for submission. As a living document, TRIPOD-LLM will evolve with the field, aiming to enhance the quality, reproducibility, and clinical applicability of LLM research in healthcare.
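As a sketch of the modular-checklist idea, items can be tagged with the research designs they apply to so that a given study completes only the relevant subset; the item texts and design names below are placeholders, not the actual TRIPOD-LLM items.

```python
# Illustrative modular reporting checklist: each item declares which study
# designs it applies to. All item wording here is hypothetical.
CHECKLIST = [
    {"item": "Title identifies the study as LLM-based", "designs": {"all"}},
    {"item": "Report base model, version, and access date", "designs": {"all"}},
    {"item": "Describe fine-tuning data and procedure", "designs": {"fine-tuning"}},
    {"item": "Report human oversight during deployment", "designs": {"deployment"}},
]

def applicable_items(design: str) -> list[str]:
    """Items a study of the given design must report ('all' items always apply)."""
    return [c["item"] for c in CHECKLIST
            if "all" in c["designs"] or design in c["designs"]]

print(applicable_items("deployment"))
```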
Language: English
Citations: 10
Frontiers in Artificial Intelligence, Journal Year: 2025, Volume and Issue: 8
Published: Jan. 27, 2025
Large Language Models (LLMs) offer considerable potential to enhance various aspects of healthcare, from aiding with administrative tasks to clinical decision support. However, despite the growing use of LLMs in healthcare, a critical gap persists in clear, actionable guidelines available to healthcare organizations and providers to ensure their responsible and safe implementation. In this paper, we propose a practical step-by-step approach to bridge this gap and support the responsible and safe implementation of LLMs in healthcare. The recommendations in this manuscript include protecting patient privacy, adapting models to healthcare-specific needs, adjusting hyperparameters appropriately, ensuring proper medical prompt engineering, distinguishing between clinical decision support (CDS) and non-CDS applications, systematically evaluating LLM outputs using a structured approach, and implementing a solid model governance structure. We furthermore propose the ACUTE mnemonic for assessing LLM responses based on Accuracy, Consistency, semantically Unaltered outputs, Traceability, and Ethical considerations. Together, these recommendations aim to provide a clear pathway from research to practice.
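The ACUTE criteria could be captured as a simple structured rubric, as in the following sketch; the 1-5 scale and field layout are assumptions for illustration and are not specified by the paper.

```python
# Hypothetical scoring rubric for the ACUTE mnemonic described above.
# The 1-5 scale is an assumption for illustration.
from dataclasses import dataclass, asdict

@dataclass
class AcuteScore:
    accuracy: int      # factually correct and clinically sound?
    consistency: int   # stable answers across re-runs and paraphrases?
    unaltered: int     # semantically Unaltered: source meaning preserved?
    traceability: int  # can claims be traced to sources or guidelines?
    ethics: int        # privacy, bias, and transparency considerations

    def overall(self) -> float:
        """Unweighted mean across the five criteria."""
        values = list(asdict(self).values())
        return sum(values) / len(values)

review = AcuteScore(accuracy=5, consistency=4, unaltered=5,
                    traceability=3, ethics=4)
print(f"ACUTE mean score: {review.overall():.1f}/5")
```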
Language: English
Citations: 1
JAMA Network Open, Journal Year: 2025, Volume and Issue: 8(3), P. e250462 - e250462
Published: March 11, 2025
Joanna S. Cavalier, MD; Benjamin A. Goldstein, PhD; Vardit Ravitsky, Jean-Christophe Bélisle-Pipon, Armando Bedoya, MD, MMCi; Jennifer Maddocks, PT, Sam Klotman, MPH; Matthew Roman, MHA, Jessica Sperling, Chun Xu, MB; Eric G. Poon, Anand Chowdhury, MMCi
Language: English
Citations: 1