Large-Scale Deep Learning for Metastasis Detection in Pathology Reports
Patrycja Krawczuk, Zachary Fox, Valentina I. Petkov, et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 13, 2024

Abstract No existing algorithm can reliably identify metastasis from pathology reports across multiple cancer types and the entire US population. In this study, we develop a deep learning model that automatically detects patients with metastatic cancer from pathology reports produced by many laboratories and covering many cancer types. We trained and validated our model on a cohort of 29,632 patients from four Surveillance, Epidemiology, and End Results (SEER) registries linked to 60,471 unstructured pathology reports. Our architecture, trained on task-specific data, outperforms a general-purpose LLM, with a recall of 0.894 compared to 0.824. We quantified model uncertainty and used it to defer uncertain reports for human review. We found that retaining 72.9% of reports for automatic processing increased recall to 0.969. This approach could streamline population-based cancer surveillance and help address the unmet need to capture recurrence or progression.
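
The deferral mechanism described in this abstract can be illustrated with a short sketch: score each report with the model, accept automatic predictions only when the top-class probability clears a threshold, and route the rest to human review. This is a minimal illustration under assumptions, not the authors' implementation; the threshold value, the `defer_low_confidence` helper, and the toy data are invented.

```python
import numpy as np

def defer_low_confidence(probs, labels, threshold=0.9):
    """Split predictions into auto-accepted and human-review sets.

    probs:  (n, 2) array of class probabilities from any classifier.
    labels: (n,) ground-truth labels (1 = metastasis), used here only
            to report recall on the retained subset.
    """
    preds = probs.argmax(axis=1)
    confidence = probs.max(axis=1)        # top-class probability as a simple uncertainty proxy
    retained = confidence >= threshold    # auto-accepted predictions
    deferred = ~retained                  # sent to human review

    kept = retained.mean()
    tp = ((preds == 1) & (labels == 1) & retained).sum()
    fn = ((preds != 1) & (labels == 1) & retained).sum()
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return retained, deferred, kept, recall

# Toy example with random scores (illustrative only).
rng = np.random.default_rng(0)
p1 = rng.uniform(size=1000)
probs = np.column_stack([1 - p1, p1])
labels = (p1 + rng.normal(0, 0.2, 1000) > 0.5).astype(int)
retained, deferred, kept, recall = defer_low_confidence(probs, labels)
print(f"retained {kept:.1%} of reports, recall on retained set = {recall:.3f}")
```

In the study, deferring the least-confident reports in this manner raised recall from 0.894 to 0.969 while still processing 72.9% of reports automatically.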

Language: English

Can Large Language Models Aid Caregivers of Pediatric Cancer Patients in Information Seeking? A Cross‐Sectional Investigation
Emre Sezgın, D Jackson, A. Baki Kocaballı, et al.

Cancer Medicine, Journal Year: 2025, Volume and Issue: 14(1)

Published: Jan. 1, 2025

ABSTRACT Purpose Caregivers in pediatric oncology need accurate and understandable information about their child's condition, treatment, and side effects. This study assesses the performance of publicly accessible large language model (LLM)‐supported tools in providing valuable and reliable information to caregivers of children with cancer. Methods In this cross‐sectional study, we evaluated four LLM‐supported tools—ChatGPT (GPT‐4), Google Bard (Gemini Pro), Microsoft Bing Chat, and Google SGE—against a set of frequently asked questions (FAQs) derived from the Children's Oncology Group Family Handbook and expert input (in total, 26 FAQs and 104 generated responses). Five experts assessed the LLM responses using measures including accuracy, clarity, inclusivity, completeness, clinical utility, and overall rating. Additionally, content quality was assessed with respect to readability, AI disclosure, source credibility, resource matching, and originality. We used descriptive analysis and statistical tests including Shapiro–Wilk, Levene's, and Kruskal–Wallis H‐tests, with Dunn's post hoc test for pairwise comparisons. Results ChatGPT showed high performance when assessed by the experts. Google Bard also performed well, especially in the accuracy and clarity of its responses, whereas Bing Chat and Google SGE had lower scores. Disclosure of responses being AI‐generated was observed less often, which may have affected trust in the tools. ChatGPT maintained a balance between response complexity and clarity. Google Bard's answers were the most readable and had the lowest complexity. Scores varied significantly (p < 0.001) across all evaluations except inclusivity. Through our thematic analysis of free‐text comments, emotional tone and empathy emerged as a unique theme, with mixed feedback on expectations that LLMs be empathetic. Conclusion LLM‐supported tools can enhance caregivers' knowledge in pediatric oncology. Each tool has strengths and areas for improvement, indicating the need for careful selection based on specific contexts. Further research is required to explore their application in other medical specialties and patient demographics, and to assess broader applicability and long‐term impacts.
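
The analysis pipeline named in this abstract (normality and variance checks, a Kruskal–Wallis H-test across tools, then Dunn's post hoc pairwise comparisons) maps onto standard library calls. The sketch below uses invented ratings and column names; Dunn's test comes from the third-party scikit-posthocs package, not from the paper itself.

```python
import pandas as pd
from scipy import stats
import scikit_posthocs as sp  # pip install scikit-posthocs

# Hypothetical expert ratings: one accuracy score per (tool, response).
df = pd.DataFrame({
    "tool":  ["ChatGPT"] * 5 + ["Bard"] * 5 + ["BingChat"] * 5 + ["SGE"] * 5,
    "score": [5, 5, 4, 5, 4,   4, 5, 4, 4, 4,   3, 3, 4, 2, 3,   3, 2, 3, 3, 2],
})
groups = [g["score"].values for _, g in df.groupby("tool")]

# Shapiro-Wilk per group and Levene's test justify the non-parametric route.
for name, g in df.groupby("tool"):
    print(name, "Shapiro-Wilk p =", round(stats.shapiro(g["score"]).pvalue, 3))
print("Levene p =", round(stats.levene(*groups).pvalue, 3))

# Kruskal-Wallis H-test: do the four tools differ overall?
h, p = stats.kruskal(*groups)
print(f"Kruskal-Wallis H = {h:.2f}, p = {p:.4f}")

# Dunn's post hoc test with Bonferroni correction for pairwise comparisons.
print(sp.posthoc_dunn(df, val_col="score", group_col="tool", p_adjust="bonferroni"))
```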

Language: English

Citations: 3

Natural Language Processing in medicine and ophthalmology: A review for the 21st-century clinician
William Rojas‐Carabali, Rajdeep Agrawal, Laura Gutiérrez-Sinisterra, et al.

Asia-Pacific Journal of Ophthalmology, Journal Year: 2024, Volume and Issue: 13(4), P. 100084 - 100084

Published: July 1, 2024

Natural Language Processing (NLP) is a subfield of artificial intelligence that focuses on the interaction between computers and human language, enabling computers to understand, generate, and derive meaning from language. NLP's potential applications in the medical field are extensive and vary from extracting data from Electronic Health Records -one of its most well-known and frequently exploited uses- to investigating relationships among genetics, biomarkers, drugs, and diseases for the proposal of new medications. NLP can also be useful in clinical decision support, patient monitoring, or image analysis. Despite this vast potential, real-world application is still limited due to various challenges and constraints, and its evolution predominantly continues within the research domain. However, with the increasingly widespread use of NLP, and particularly the availability of large language models such as ChatGPT, it is crucial that professionals are aware of the status, uses, and limitations of these technologies.

Language: English

Citations: 5

The Potential Impact of Large Language Models on Doctor–Patient Communication: A Case Study in Prostate Cancer
Marius Geantă, Daniel Bădescu, Narcis Chirca, et al.

Healthcare, Journal Year: 2024, Volume and Issue: 12(15), P. 1548 - 1548

Published: Aug. 5, 2024

Background: In recent years, the integration of large language models (LLMs) into healthcare has emerged as a revolutionary approach to enhancing doctor–patient communication, particularly in the management of diseases such as prostate cancer. Methods: Our paper evaluated the effectiveness of three prominent LLMs—ChatGPT (3.5), Gemini (Pro), and Co-Pilot (the free version)—against the official Romanian Patient's Guide on prostate cancer. Employing a randomized and blinded method, our study engaged eight medical professionals to assess the responses of these LLMs based on accuracy, timeliness, comprehensiveness, and user-friendliness. Results: The primary objective was to explore whether LLMs, when operating in Romanian, offer comparable or superior performance to the Guide, considering their potential to personalize communication and enhance informational accessibility for patients. The results indicated that ChatGPT generally provided more accurate and user-friendly information compared to the Guide. Conclusions: The findings suggest a significant potential for LLMs to enhance doctor–patient communication by providing accessible and accurate information. However, the variability in performance across different LLMs underscores the need for tailored implementation strategies. We highlight the importance of integrating LLMs with a nuanced understanding of their capabilities and limitations to optimize their use in clinical settings.

Language: English

Citations: 5

Explaining decisions without explainability? Artificial intelligence and medicolegal accountability
Melissa D. McCradden, Ian Stedman

Future Healthcare Journal, Journal Year: 2024, Volume and Issue: 11(3), P. 100171 - 100171

Published: Sept. 1, 2024


Language: English

Citations: 5

Large Language Model Prompting Techniques for Advancement in Clinical Medicine
Krish Shah, Andrew Xu, Yatharth Sharma, et al.

Journal of Clinical Medicine, Journal Year: 2024, Volume and Issue: 13(17), P. 5101 - 5101

Published: Aug. 28, 2024

Large Language Models (LLMs) have the potential to revolutionize clinical medicine by enhancing healthcare access, diagnosis, surgical planning, and education. However, their utilization requires careful prompt engineering to mitigate challenges like hallucinations and biases. Proper use of LLMs involves understanding foundational concepts such as tokenization, embeddings, and attention mechanisms, alongside strategic prompting techniques to ensure accurate outputs. For innovative solutions, it is essential to maintain ongoing collaboration between AI technology and medical professionals. Ethical considerations, including data security and bias mitigation, are critical to their application. By leveraging LLMs as supplementary resources in research and education, we can enhance learning and support knowledge-based inquiries, ultimately advancing the quality and accessibility of care. Continued development is necessary to fully realize their potential in transforming healthcare.
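
As an illustration of the kind of strategic prompting this review discusses, the sketch below assembles a clinical question into a structured prompt with an assigned role, explicit constraints, and a worked few-shot example. The template wording and the `build_prompt` helper are invented for illustration and are not drawn from the paper.

```python
# Illustrative few-shot, role-constrained prompt template for a clinical Q&A task.
FEW_SHOT_EXAMPLE = (
    "Question: What is the first-line imaging test for suspected kidney stones?\n"
    "Answer: Non-contrast CT of the abdomen and pelvis. "
    "Confidence: high. Sources needed: clinical guideline.\n"
)

def build_prompt(question: str) -> str:
    """Combine role, constraints, and a few-shot example into one prompt string."""
    return (
        "You are a clinical decision-support assistant for licensed physicians.\n"
        "Constraints:\n"
        "- Answer in at most three sentences.\n"
        "- State your confidence (high/medium/low).\n"
        "- If the question cannot be answered safely, say so instead of guessing.\n\n"
        f"{FEW_SHOT_EXAMPLE}\n"
        f"Question: {question}\n"
        "Answer:"
    )

print(build_prompt("When is prophylactic anticoagulation indicated after hip surgery?"))
```

Role assignment, explicit output constraints, and few-shot demonstrations are three of the techniques such reviews commonly group under prompt engineering to reduce hallucinations.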

Language: English

Citations: 4

Evaluating large language models as patient education tools for inflammatory bowel disease: A comparative study
Yan Zhang, Xiao-Han Wan, Qingzhou Kong, et al.

World Journal of Gastroenterology, Journal Year: 2025, Volume and Issue: 31(6)

Published: Jan. 10, 2025

Inflammatory bowel disease (IBD) is a global health burden that affects millions of individuals worldwide, necessitating extensive patient education. Large language models (LLMs) hold promise for addressing patients' information needs. However, LLM use to deliver accurate and comprehensible IBD-related medical information has yet to be thoroughly investigated. To assess the utility of three LLMs (ChatGPT-4.0, Claude-3-Opus, and Gemini-1.5-Pro) as a reference point for patients with IBD, two gastroenterology experts in this comparative study generated 15 questions that reflected common patient concerns. These questions were used to evaluate the performance of the three LLMs. The answers provided by each model were independently assessed using a Likert scale focusing on accuracy, comprehensibility, and correlation. Simultaneously, patients were invited to rate the comprehensibility of the answers. Finally, a readability assessment was performed. Overall, the three LLMs achieved satisfactory levels of accuracy and completeness when answering the questions, although performance varied. All investigated models demonstrated strengths in providing basic information, such as the definition of IBD as well as its symptoms and diagnostic methods. Nevertheless, when dealing with more complex advice, such as medication side effects, dietary adjustments, and complication risks, the quality of answers was inconsistent between the models. Notably, Claude-3-Opus performed better than the other models in these areas. LLMs have potential as educational tools for patients with IBD; however, there are discrepancies between the models. Further optimization and the development of specialized models are necessary to ensure the accuracy and safety of the information provided.
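
A readability assessment like the one mentioned can be approximated with the Flesch Reading Ease formula, which scores text from average sentence length and syllables per word (higher means easier to read). This is a sketch under assumptions: the abstract does not specify which readability metric the study used, and the syllable counter below is a rough heuristic.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))

# Hypothetical model answer to an IBD patient question.
answer = ("Ulcerative colitis is a long-term condition that causes inflammation "
          "of the colon. Common symptoms include diarrhea and abdominal pain.")
print(f"Flesch Reading Ease: {flesch_reading_ease(answer):.1f}")  # ~60-70 = plain English
```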

Language: English

Citations: 0

Evaluating the Performance of ChatGPT-4o Oncology Expert in Comparison to Standard Medical Oncology Knowledge: A Focus on Treatment-Related Clinical Questions
Oğuzcan Kınıkoğlu, Deniz Işık

Cureus, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 27, 2025

Integrating artificial intelligence (AI) into oncology can revolutionize decision-making by providing accurate information. This study evaluates the performance of the ChatGPT-4o (OpenAI, San Francisco, CA) Oncology Expert in addressing open-ended clinical questions. Thirty-seven treatment-related questions on solid organ tumors were selected from a hematology-oncology textbook. Responses from the Oncology Expert and the textbook were anonymized and independently evaluated by two medical oncologists using a structured scoring system focused on accuracy and justification. Statistical analysis, including paired t-tests, was conducted to compare scores, and interrater reliability was assessed with Cohen's Kappa. The Oncology Expert achieved a significantly higher average score of 7.83 compared to the textbook's 7.0 (p < 0.01). In 10 cases, the Oncology Expert provided more updated answers, demonstrating its ability to integrate recent knowledge. In 26 cases, both sources were equally relevant, but the Oncology Expert's responses were clearer and easier to understand. Cohen's Kappa indicated almost perfect interrater agreement (κ = 0.93). Both sources included outdated information for bladder cancer treatment, underscoring the need for regular updates. The Oncology Expert shows significant potential as a tool offering precise, up-to-date, and user-friendly responses. It could transform practice by enhancing efficiency, improving educational tools, and serving as a reliable adjunct in clinical workflows. However, its integration requires regular updates, expert validation, and a collaborative approach to ensure relevance in the rapidly evolving field of oncology.
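
The comparison described here, a paired t-test on per-question scores plus Cohen's Kappa for interrater reliability, maps directly onto standard library calls. A minimal sketch follows; the scores are fabricated for illustration and do not reproduce the study's data.

```python
from scipy.stats import ttest_rel
from sklearn.metrics import cohen_kappa_score

# Hypothetical per-question scores (the same 10 questions scored for both sources).
expert_scores   = [8, 7, 9, 8, 7, 8, 9, 7, 8, 8]   # ChatGPT-4o "Oncology Expert"
textbook_scores = [7, 7, 8, 7, 6, 7, 8, 7, 7, 7]

# Paired t-test: appropriate because the same questions are scored under both conditions.
t, p = ttest_rel(expert_scores, textbook_scores)
print(f"paired t = {t:.2f}, p = {p:.4f}")

# Cohen's Kappa between two raters scoring the same set of responses.
rater_a = [8, 7, 9, 8, 7, 8, 9, 7, 8, 8]
rater_b = [8, 7, 9, 8, 6, 8, 9, 7, 8, 8]
print(f"Cohen's kappa = {cohen_kappa_score(rater_a, rater_b):.2f}")
```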

Language: English

Citations: 0

Enhancing healthcare resource allocation through large language models
Fang Wan, Kezhi Wang, Tao Wang, et al.

Swarm and Evolutionary Computation, Journal Year: 2025, Volume and Issue: 94, P. 101859 - 101859

Published: Feb. 5, 2025

Language: English

Citations: 0

Comparative evaluation and performance of large language models on expert level critical care questions: a benchmark study
Jessica D. Workum, Bas W. S. Volkers, Davy van de Sande, et al.

Critical Care, Journal Year: 2025, Volume and Issue: 29(1)

Published: Feb. 10, 2025

Abstract Background Large language models (LLMs) show increasing potential for use in healthcare, both for administrative support and for clinical decision making. However, reporting on their performance in critical care medicine is lacking. Methods This study evaluated five LLMs (GPT-4o, GPT-4o-mini, GPT-3.5-turbo, Mistral Large 2407, and Llama 3.1 70B) on 1181 multiple choice questions (MCQs) from the gotheextramile.com database, a comprehensive database of critical care questions at European Diploma in Intensive Care examination level. Their performance was compared to random guessing and to 350 human physicians on a 77-MCQ practice test. Metrics included accuracy, consistency, and domain-specific performance. Costs, as a proxy for energy consumption, were also analyzed. Results GPT-4o achieved the highest accuracy at 93.3%, followed by Llama 3.1 70B (87.5%), Mistral Large 2407 (87.9%), GPT-4o-mini (83.0%), and GPT-3.5-turbo (72.7%). Random guessing yielded 41.5% (p < 0.001). On the practice test, all models surpassed the human physicians, scoring 89.0%, 80.9%, 84.4%, 80.3%, and 66.5%, respectively, compared with 42.7% for random guessing (p < 0.001) and 61.9% for the physicians. In contrast to the other models (p < 0.001), GPT-3.5-turbo's performance did not significantly outperform the physicians (p = 0.196). Despite high overall accuracy, all models gave consistently incorrect answers to some questions. The most expensive model was GPT-4o, costing over 25 times more than the least expensive model, GPT-4o-mini. Conclusions LLMs exhibit exceptional performance on expert-level critical care questions, with four models outperforming human physicians on a European-level practice exam. GPT-4o led in performance but raised concerns about energy consumption. All models produced consistently incorrect answers, highlighting the need for thorough and ongoing evaluations to guide responsible implementation in clinical settings.
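
A benchmark of this shape, accuracy over an MCQ set with a consistency check across repeated runs and a comparison against the random-guessing baseline, can be sketched as below. The `ask_model` stub stands in for whatever API each model exposes, the question set is a toy, and the binomial test against a chance baseline is one reasonable choice; the paper's exact statistical procedure is not given in the abstract.

```python
import random
from scipy.stats import binomtest

def ask_model(question: str, options: list[str]) -> int:
    """Stub for a real LLM call; here it guesses randomly for demonstration."""
    return random.randrange(len(options))

def evaluate(mcqs: list[dict], runs: int = 3) -> tuple[float, float]:
    """Return (accuracy, consistency), where consistency is the fraction of
    questions answered identically across repeated runs."""
    correct, consistent = 0, 0
    for q in mcqs:
        answers = [ask_model(q["question"], q["options"]) for _ in range(runs)]
        correct += answers[0] == q["answer"]
        consistent += len(set(answers)) == 1
    return correct / len(mcqs), consistent / len(mcqs)

# Toy 4-option question set; a real benchmark would load all 1181 items.
mcqs = [{"question": f"Q{i}", "options": ["A", "B", "C", "D"], "answer": 0}
        for i in range(100)]
acc, cons = evaluate(mcqs)

# Is accuracy better than chance (25% for four options in this toy set)?
result = binomtest(int(acc * len(mcqs)), len(mcqs), p=0.25, alternative="greater")
print(f"accuracy={acc:.1%}, consistency={cons:.1%}, p vs chance={result.pvalue:.3g}")
```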

Language: English

Citations: 0

Using mathematical modelling and AI to improve delivery and efficacy of therapies in cancer
Constantinos Harkos, Andreas G. Hadjigeorgiou, Chrysovalantis Voutouri, et al.

Nature reviews. Cancer, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 19, 2025

Language: English

Citations: 0