Generative AI and Large Language Models in Reducing Medication Related Harm and Adverse Drug Events – A Scoping Review
Jasmine Chiat Ling Ong, Chen Michael, Ning Ng

et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Sept. 14, 2024

Background: Medication-related harm has a significant impact on global healthcare costs and patient outcomes, accounting for deaths in 4.3 per 1000 patients. Generative artificial intelligence (GenAI) has emerged as a promising tool for mitigating the risks of medication-related harm. In particular, large language models (LLMs) and well-developed generative adversarial networks (GANs) are showing promise in related tasks. This review aims to explore the scope and effectiveness of generative AI in reducing medication-related harm, identifying existing development and challenges in research. Methods: We searched for peer-reviewed articles in PubMed, Web of Science, Embase, and Scopus for literature published from January 2012 to February 2024. We included studies focusing on the development or application of generative AI in mitigating the risk of medication-related harm during the entire medication use process. We excluded studies using traditional AI methods only, those unrelated to healthcare settings, and those concerning non-prescribed medication uses such as supplements. Extracted variables included study characteristics, AI model specifics and performance, and any patient outcome evaluated. Findings: A total of 2203 articles were identified, and 14 met the criteria for inclusion into the final review. We found that generative AI and LLMs were used in a few key applications: drug-drug interaction identification and prediction; clinical decision support; and pharmacovigilance. While the performance and utility of these models varied, they generally showed promise in areas like early identification and classification of adverse drug events and supporting decision-making for medication management. However, no studies tested these models prospectively, suggesting a need for further investigation into the integration of generative AI into real-world tools to improve patient safety and outcomes effectively. Interpretation: Generative AI shows promise in mitigating medication-related harms, but there are gaps in research rigor and ethical considerations. Future research should focus on the creation of high-quality, task-specific benchmarking datasets and on implementation outcomes.

Language: English

Artificial Intelligence-Powered Hand Surgery Consultation: GPT-4 as an Assistant in a Hand Surgery Outpatient Clinic
Tim Leypold, Benedikt Schäfer, Anja M. Boos

et al.

The Journal Of Hand Surgery, Journal Year: 2024, Volume and Issue: 49(11), P. 1078 - 1088

Published: July 27, 2024

Language: English

Citations

5

Generative Large Language Models in Electronic Health Records for Patient Care Since 2023: A Systematic Review
Xinsong Du, Yifei Wang, Zhengyang Zhou

et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Aug. 12, 2024

Background: Generative large language models (LLMs) represent a significant advancement in natural language processing, achieving state-of-the-art performance across various tasks. However, their application in clinical settings using real electronic health records (EHRs) is still rare and presents numerous challenges. Objective: This study aims to systematically review the use of generative LLMs and the effectiveness of relevant techniques in patient care-related topics involving EHRs, summarize the challenges faced, and suggest future directions. Methods: A Boolean search for peer-reviewed articles was conducted on May 19th, 2024 using PubMed and Web of Science to include research articles published since 2023, which was one month after the release of ChatGPT. The search results were deduplicated. Multiple reviewers, including biomedical informaticians, computer scientists, and a physician, screened the publications for eligibility and conducted data extraction. Only studies utilizing generative LLMs to analyze real EHR data were included. We summarized the use of prompt engineering, fine-tuning, multimodal EHR data, and evaluation metrics. Additionally, we identified current challenges in applying LLMs in clinical settings as reported by the included studies and proposed future directions. Results: The initial search identified 6,328 unique studies, with 76 studies included after eligibility screening. Of these, 67 studies (88.2%) employed zero-shot prompting, and five of them reported 100% accuracy on specific clinical tasks. Nine studies used advanced prompting strategies; four tested these strategies experimentally, finding that prompt engineering improved performance and noting a non-linear relationship between the number of examples in a prompt and the improvement in performance. Eight studies explored fine-tuning generative LLMs; all reported performance improvements on specific tasks, but three of them noted potential performance degradation on certain tasks after fine-tuning. Two studies utilized multimodal data, where LLM-based decision-making enabled accurate disease diagnosis and prognosis. The studies employed 55 different evaluation metrics for 22 purposes, such as correctness, completeness, and conciseness. Two studies investigated LLM bias, one detecting no bias and the other finding that male patients received more appropriate clinical suggestions. Six studies identified hallucinations, such as fabricating names in structured thyroid ultrasound reports. Additional challenges included, but were not limited to, the impersonal tone of LLM consultations, which made patients uncomfortable, and the difficulty patients had in understanding LLM responses.
Conclusion: Our review indicates that few studies have employed advanced computational techniques to enhance LLM performance. The diverse evaluation metrics highlight the need for standardization. LLMs currently cannot replace physicians due […]

Language: English

Citations

4

Evaluating accuracy and reproducibility of large language model performance on critical care assessments in pharmacy education
Huibo Yang, Mengxuan Hu, Amoreena Most

et al.

Frontiers in Artificial Intelligence, Journal Year: 2025, Volume and Issue: 7

Published: Jan. 9, 2025

Large language models (LLMs) have demonstrated impressive performance on medical licensing and diagnosis-related exams. However, comparative evaluations to optimize LLM performance and ability in the domain of comprehensive medication management (CMM) are lacking. The purpose of this evaluation was to test various LLMs and performance optimization strategies on critical care pharmacotherapy questions used in the assessment of Doctor of Pharmacy students. In a comparative analysis using 219 multiple-choice pharmacotherapy questions, five LLMs (GPT-3.5, GPT-4, Claude 2, Llama2-7b and Llama2-13b) were evaluated. Each LLM was queried multiple times to evaluate the primary outcome of accuracy (i.e., correctness). Secondary outcomes included variance, the impact of prompt engineering techniques (e.g., chain-of-thought, CoT) and of training a customized GPT on performance, and comparison to third-year Doctor of Pharmacy students on knowledge recall vs. knowledge application questions. Accuracy and variance were compared with Student's t-test to compare performance under different model settings. ChatGPT-4 exhibited the highest accuracy (71.6%), while Llama2-13b had the lowest variance (0.070). All LLMs performed more accurately on knowledge recall vs. knowledge application questions (e.g., ChatGPT-4: 87% vs. 67%). When applied to ChatGPT-4, few-shot CoT across runs improved accuracy (77.4% vs. 71.5%) with no effect on variance. Self-consistency and the custom-trained GPT demonstrated similar accuracy to ChatGPT-4 with few-shot CoT. Overall pharmacy student accuracy was 81%, compared to an optimal overall LLM accuracy of 73%. Comparing question types, six […] demonstrated equivalent or higher accuracy than pharmacy students on knowledge recall questions (e.g., self-consistency vs. students: 93% vs. 84%), but pharmacy students achieved higher accuracy than all LLMs on knowledge application questions (e.g., self-consistency vs. students: 68% vs. 80%). […] most accurate […] average overall […]. These findings support the need for future […] type of output needed. Reliance on LLMs is only supported for recall-based questions.
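As an illustration of the outcome measures this abstract names (accuracy per run, variance across repeated runs, and self-consistency), the following is a minimal Python sketch. It is not the authors' code: the answer key, the three runs, and the helper names `run_accuracy` and `majority_vote` are hypothetical, population variance is an assumed choice, and self-consistency is approximated here as a simple per-question majority vote over repeated runs.

```python
from collections import Counter
from statistics import mean, pvariance


def run_accuracy(answers, key):
    """Fraction of questions answered correctly in a single run."""
    return sum(a == k for a, k in zip(answers, key)) / len(key)


def majority_vote(runs):
    """Self-consistency (assumed form): keep the most common answer per question."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*runs)]


# Hypothetical data: 4 multiple-choice questions, 3 repeated runs of one model.
answer_key = ["A", "C", "B", "D"]
runs = [
    ["A", "C", "D", "D"],
    ["A", "B", "B", "D"],
    ["A", "C", "B", "D"],
]

per_run = [run_accuracy(r, answer_key) for r in runs]   # accuracy of each run
mean_accuracy = mean(per_run)                           # average over runs
accuracy_variance = pvariance(per_run)                  # run-to-run spread
voted_accuracy = run_accuracy(majority_vote(runs), answer_key)

print(per_run, mean_accuracy, accuracy_variance, voted_accuracy)
```

In this toy data each individual run misses at most one question, while the majority vote recovers every answer; real evaluations would also need a tie-breaking rule, which this sketch leaves to `Counter` ordering.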

Language: English

Citations

0

Evaluating ChatGPT o1’s Capabilities in Peripheral Nerve Surgery: Advancing Artificial Intelligence in Clinical Practice
Tim Leypold, Jörg Bahm, Justus P. Beier

et al.

World Neurosurgery, Journal Year: 2025, Volume and Issue: 196, P. 123753 - 123753

Published: March 6, 2025

Artificial intelligence (AI) continues to advance in healthcare, offering innovative approaches to enhance clinical decision-making and patient management. Peripheral nerve surgery poses unique challenges due to the complexity of cases and the need for precise diagnostic and therapeutic strategies. This study investigates the application of OpenAI's generative AI model, o1, in assisting with intricate decision-making processes in peripheral nerve surgery. Utilizing advanced prompt engineering techniques, o1 was configured as a virtual medical assistant (GPT-NS) to process five simulated clinical scenarios modeled after real-world peripheral nerve surgery cases. The AI guided surgeons through patient history, diagnostics, and treatment planning, culminating in case summaries. A panel of specialists and residents evaluated the AI's performance using a Likert scale across seven criteria. GPT-NS demonstrated strong capabilities, achieving an average score of 4.3. High ratings were observed for understanding of clinical issues and presentation clarity. However, areas for improvement were noted in the sequencing of recommendations. Despite lower ratings indicating human evaluators' perception of their own superiority over the AI in handling complex cases, GPT-NS showed promise as a supportive tool in clinical practice. As LLM (Large Language Model) capabilities improve, it is becoming increasingly important that absolute experts assess the accuracy of AI answers to ensure reliable and clinically sound integration into healthcare practices. This study underscores the potential of AI in augmenting clinical decision-making in highly specialized fields like peripheral nerve surgery while demonstrating the ongoing importance of human expertise. Future research should explore ways to further refine the AI's capabilities and its integration into routine surgical workflows.

Language: English

Citations

0

A comprehensive review of neurotransmitter modulation via artificial intelligence: A new frontier in personalized neurobiochemistry
Jaleh Bagheri Hamzyan Olia, Arasu Raman, Chou‐Yi Hsu

et al.

Computers in Biology and Medicine, Journal Year: 2025, Volume and Issue: 189, P. 109984 - 109984

Published: March 14, 2025

Language: English

Citations

0

A scoping review on generative AI and large language models in mitigating medication related harm
Jasmine Chiat Ling Ong, Michael Hao Chen, Ning Ng

et al.

npj Digital Medicine, Journal Year: 2025, Volume and Issue: 8(1)

Published: March 28, 2025

Medication-related harm has a significant impact on global healthcare costs and patient outcomes. Generative artificial intelligence (GenAI) and large language models (LLMs) have emerged as promising tools in mitigating the risks of medication-related harm. This review evaluates the scope and effectiveness of GenAI and LLMs in reducing medication-related harm. We screened 4 databases for literature published from 1st January 2012 to 15th October 2024. A total of 3988 articles were identified, and 30 met the criteria for inclusion into the final review. Generative AI and LLMs were applied in three key applications: drug-drug interaction identification and prediction, clinical decision support, and pharmacovigilance. While the performance and utility of these models varied, they generally showed promise in early identification, classification of adverse drug events, and supporting decision-making for medication management. However, no studies tested these models prospectively, suggesting a need for further investigation into their integration and real-world application.

Language: English

Citations

0

Integrating AI in Lipedema Management: Assessing the Efficacy of GPT-4 as a Consultation Assistant
Tim Leypold, Lara F. Lingens, Justus P. Beier

et al.

Life, Journal Year: 2024, Volume and Issue: 14(5), P. 646 - 646

Published: May 20, 2024

The role of artificial intelligence (AI) in healthcare is evolving, offering promising avenues for enhancing clinical decision making and patient management. Limited knowledge about lipedema often leads to patients being misdiagnosed with conditions like lymphedema or obesity rather than correctly diagnosed with lipedema. Furthermore, patients with lipedema often present with intricate and extensive medical histories, resulting in significant time consumption during consultations. AI could, therefore, improve the management of these patients. This research investigates the utilization of OpenAI’s Generative Pre-Trained Transformer 4 (GPT-4), a sophisticated large language model (LLM), as an assistant in consultations for lipedema patients. Six simulated scenarios were designed to mirror typical patient consultations commonly encountered in a lipedema clinic. GPT-4 was tasked with conducting patient interviews to gather medical histories, presenting its findings, making preliminary diagnoses, and recommending further diagnostic and therapeutic actions. Advanced prompt engineering techniques were employed to refine the efficacy, relevance, and accuracy of GPT-4’s responses. A panel of experts in lipedema treatment, using a Likert Scale, evaluated GPT-4’s responses across six key criteria. Scoring ranged from 1 (lowest) to 5 (highest), with GPT-4 achieving an average score of 4.24, indicating good reliability and applicability in a clinical setting. This study is one of the initial forays into applying large language models in specific clinical scenarios, such as lipedema consultations. It demonstrates the potential of AI in supporting clinical practices and emphasizes the continuing importance of human expertise in the medical field, despite ongoing technological advancements.

Language: English

Citations

3

Qualitative metrics from the biomedical literature for evaluating large language models in clinical decision-making: a narrative review
Cindy Ho, Tiffany Tian, Alessandra T. Ayers

et al.

BMC Medical Informatics and Decision Making, Journal Year: 2024, Volume and Issue: 24(1)

Published: Nov. 26, 2024

The large language models (LLMs), most notably ChatGPT, released since November 30, 2022, have prompted shifting attention to their use in medicine, particularly for supporting clinical decision-making. However, there is little consensus in the medical community on how LLM performance in clinical contexts should be evaluated. We performed a literature review of PubMed to identify publications between December 1, 2022, and April 2024 that discussed assessments of LLM-generated diagnoses or treatment plans. We selected 108 relevant articles from PubMed for analysis. The most frequently used LLMs were GPT-3.5, GPT-4, Bard, LLaMa/Alpaca-based models, and Bing Chat. The five most frequently used criteria for scoring LLM outputs were "accuracy", "completeness", "appropriateness", "insight", and "consistency". The criteria for defining high-quality LLMs have been used consistently by researchers over the past 1.5 years. We identified a high degree of variation in how studies reported their findings and assessed LLM performance. Standardized reporting of qualitative evaluation metrics that assess the quality of LLM outputs can be developed to facilitate research on LLMs in healthcare.

Language: English

Citations

3

The future is now, old man
Marko Lucijanić, Robert Likić

British Journal of Clinical Pharmacology, Journal Year: 2024, Volume and Issue: 90(3), P. 618 - 619

Published: Feb. 5, 2024

The proper use of medicines, ensuring that the right medicine is used at the right time and in the right way, is a fundamental principle for clinical pharmacologists, physicians, nurses and other healthcare professionals globally. Despite this being an apparent rule, real-life practice often reveals that errors in medical service provision are more common than exceptions. There are multiple stages of treatment where errors can lead to adverse outcomes. A recent large retrospective cohort study among hospitalized adults who were transferred to an intensive care unit or who died reported that 23% of patients experienced a diagnostic error, and in 17.8% of these cases, the errors contributed to temporary harm, permanent harm or death.1 Previously, medical errors were considered the third leading cause of death, following cancer and heart disease.2 However, this claim itself turned out to be erroneous,3 highlighting the irony and fragility of our understanding of errors in medicine. In this context, the emergence of artificial intelligence (AI) offers a promising avenue to enhance clinical pharmacology. AI provides a potential safeguard against inadvertent medication errors, surpassing what human education, professionalism and continuous attention in prescribing can achieve alone. This inspired us to organize a special 'holidAI' themed issue, focusing on the exciting intersection of AI, machine learning (ML) and pharmacotherapy. We have received numerous high-quality papers demonstrating the strengths and weaknesses of AI and ML applied in clinical pharmacology, along with future areas of their application. Delving into specific applications, Rubinic et al.4 present a thought-provoking concept in this thematic issue. They explore the vulnerability of large language models (LLMs) to misuse in bioweapon development. Their work includes a literature review, an examination of regulatory documents concerning ethical AI use, and a case illustrating the manipulation of LLMs in creating harmful substances.
The authors conclude that the current regulatory landscape is ill-equipped to address the challenges posed by LLMs and suggest a dual role for LLMs: not just as risks but also as tools for developing countermeasures against novel hazardous substances. History teaches us that such threats are real and often overlooked due to a lack of contemporary education on the topic.5 Ryan et al.6 highlight the necessity for clinical pharmacologists to understand AI and its implementation in clinical practice. They […] model development and the issues surrounding its evaluation and deployment. Bakkum et al.7 investigate the use of AI in creating diverse and inclusive clinical vignettes for education. Using ChatGPT (OpenAI, GPT 3.5), they generated cases for various assignments and shared them as open educational resources, balancing the trade-offs of this approach. Their findings will be further evaluated through scientific research. Exploring the integration of AI in pharmacy practice, an international survey conducted by Busch et al.8 spanned 12 countries and revealed predominantly positive attitudes towards AI among undergraduate students. A notable finding was that students with prior AI coursework felt more prepared for its professional application, underscoring the need for enhanced AI education within curricula. A comparison between AI and healthcare providers in the decision-making process of benzodiazepine deprescribing is the focus of Buzancic et al.9 They found a high agreement rate (95%), with variations in different criteria. They identified important limitations of AI, including ambiguities and inaccuracies, positioning AI as a supportive tool rather than a decision-maker in clinical practices. In the field of nephrotoxicity prediction, Noda et al.10 developed a model for individualized prediction in patients administered tacrolimus. Their model showed improved predictive ability over traditional concentration thresholds, indicating higher accuracy in identifying high-risk patients before treatment initiation. An approach to predicting and preventing complications in maxillofacial surgery using an AI algorithm is proposed by Prazetina et al.11 in a protocol for a randomized controlled trial. Their methodology is aimed to optimize hemodynamic parameters during free flap surgery in patients. Furthermore, Pavlov et al.12 provided an analysis of outcomes in patients with heart failure treated with sodium-glucose co-transporter-2 inhibitors. Their work contrasts results obtained by traditional methods with those from ML algorithms, calling for scrutiny and critical appraisal of AI findings.
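Finally, modelling of illicit substance abuse patterns across age groups was undertaken by Tummala et al.13 They presented a detailed model that could inform clinical trial design and the pharmacometrics of substance use disorder treatments in the future. These studies collectively underscore the transformative potential of AI in reshaping clinical pharmacology. As we embrace these innovations, we must remember that haste in adopting new technologies should not lead us astray. Like the iconic scene in Malcolm in the Middle, we must avoid finding ourselves unprepared for the unforeseen consequences of these powerful tools. Dewey's famous line from the show reminds us, 'The future is now, old man', that we already live in the once distant future. Both authors contributed equally to writing the manuscript. None.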

Language: English

Citations

2

Spotlight commentary: Integrating artificial intelligence in clinical pharmacology: Opportunities, challenges and ethical imperatives
Karlo Petković, Zdeslav Strika, Robert Likić

et al.

British Journal of Clinical Pharmacology, Journal Year: 2024, Volume and Issue: unknown

Published: Sept. 5, 2024

Language: English

Citations

1