Large-Scale Deep Learning for Metastasis Detection in Pathology Reports DOI Creative Commons
Patrycja Krawczuk, Zachary Fox, Valentina I. Petkov

et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 13, 2024

Abstract: No existing algorithm can reliably identify metastasis from pathology reports across multiple cancer types and the entire US population. In this study, we develop a deep learning model that automatically detects patients with metastatic cancer from pathology reports produced by many laboratories and covering many cancer types. We trained and validated our model on a cohort of 29,632 patients from four Surveillance, Epidemiology, and End Results (SEER) registries linked to 60,471 unstructured pathology reports. Our architecture, trained on task-specific data, outperforms a general-purpose LLM, with a recall of 0.894 compared to 0.824. We quantified model uncertainty and used it to defer low-confidence predictions for human review, and found that retaining 72.9% of reports increased recall to 0.969. This approach could streamline population-based cancer surveillance and help address the unmet need to capture recurrence or progression.
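
The deferral step described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `defer_by_uncertainty` function, the probability input, and the threshold value are illustrative assumptions showing how a confidence cutoff splits reports into auto-labeled and human-review sets.

```python
import numpy as np

def defer_by_uncertainty(probs, threshold=0.9):
    """Split predictions into auto-labeled and human-review sets.

    probs: array of model probabilities P(metastasis) per report
           (hypothetical output of a trained classifier).
    threshold: minimum confidence of the predicted class required
               to keep a prediction without human review.
    """
    probs = np.asarray(probs)
    confidence = np.maximum(probs, 1.0 - probs)   # confidence of the predicted class
    keep = confidence >= threshold                # confident enough to auto-label
    defer = ~keep                                 # route these reports to a human
    return keep, defer

# Toy usage: 6 reports; the 2 near the decision boundary get deferred.
probs = [0.98, 0.55, 0.03, 0.91, 0.48, 0.88]
keep, defer = defer_by_uncertainty(probs, threshold=0.85)
print(f"auto-labeled: {keep.sum()} reports, deferred: {defer.sum()} reports")
```

Raising the threshold trades coverage (the 72.9% of reports retained in the abstract) against error rate on the retained set, which is how deferral can lift recall to 0.969.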

Language: English

Artificial Intelligence in Relation to Accurate Information and Tasks in Gynecologic Oncology and Clinical Medicine—Dunning–Kruger Effects and Ultracrepidarianism DOI Creative Commons
Edward J. Pavlik, Jason Woodward,

Frank Lawton

et al.

Diagnostics, Journal Year: 2025, Volume and Issue: 15(6), P. 735 - 735

Published: March 15, 2025

Publications on the application of artificial intelligence (AI) to many situations, including those in clinical medicine, created in 2023–2024 are reviewed here. Because of the short time frame covered, it is not possible to conduct an exhaustive analysis as would be the case for meta-analyses or systematic reviews. Consequently, this literature review presents a narrative examination of AI in relation to contemporary topics in clinical medicine. The landscape of findings spans 254 papers published in 2024 topically reporting on AI, of which 83 articles are considered here because they contain evidence-based findings. In particular, the types of cases addressed deal with the accuracy of initial differential diagnoses, cancer treatment recommendations, board-style exams, and performance on various tasks, including imaging. Importantly, summaries of the validation techniques used to evaluate AI are presented. This review focuses on AIs that have relevancy evidenced by evaluation in publications, and speaks to both what has been promised and what has been delivered by AI systems. Readers will be able to understand when generative AI may be expressing views without having the necessary information (ultracrepidarianism) or responding as if it had expert knowledge when it does not. A lack of awareness that AI can deliver inadequate or confabulated output can result in incorrect medical decisions and inappropriate applications (Dunning–Kruger effect). As a result, in certain cases, an AI system might underperform and provide results that greatly overestimate any validity.

Language: English

Citations

0

Comparative analysis of a standard (GPT-4o) and reasoning-enhanced (o1 pro) large language model on complex clinical questions from the Japanese orthopaedic board examination DOI
Joe Hasei,

Ryuichi Nakahara,

Koichi Takeuchi

et al.

Journal of Orthopaedic Science, Journal Year: 2025, Volume and Issue: unknown

Published: April 1, 2025

Language: English

Citations

0

Accuracy, Consistency, and Contextual Understanding of Large Language Models in Restorative Dentistry and Endodontics DOI Creative Commons

Claire Lafourcade,

Olivia Kérourédan, Benoit Ballester

et al.

Journal of Dentistry, Journal Year: 2025, Volume and Issue: unknown, P. 105764 - 105764

Published: April 1, 2025

This study aimed to evaluate and compare the performance of several large language models (LLMs) in the context of restorative dentistry and endodontics, focusing on their accuracy, consistency, and contextual understanding. The dataset was extracted from the national educational archives of the Collège National des Enseignants en Odontologie Conservatrice (CNEOC) and includes all chapters of the reference manual for dental residency applicants. Multiple-choice questions (MCQs) were selected following a review by three independent academic experts. Four LLMs were assessed: ChatGPT-3.5 and ChatGPT-4 (OpenAI), Claude-3 (Anthropic), and Mistral 7B (Mistral AI). Model accuracy was determined by comparing responses with expert-provided answers. Consistency was measured through robustness (the ability to provide identical answers to paraphrased questions) and repeatability (the ability to provide identical answers to the same question). Contextual understanding was evaluated based on the model's ability to correctly categorise terms and infer their definitions. Additionally, the models were reassessed after being provided with the relevant full course chapter. A total of 517 MCQs and 539 definitions were included. The more advanced models demonstrated significantly higher accuracy than Mistral 7B and showed greater robustness. Advanced models displayed high contextual understanding when presenting content, although performance varied for closely related concepts. Supplying course content generally improved response accuracy, though inconsistently across topics. Even the most advanced LLMs, such as Claude 3, achieve only moderate accuracy and require cautious use due to inconsistencies. Future studies should focus on integrating validated content and refining prompt engineering to enhance the clinical utility of LLMs. These findings underscore the potential of context-based prompting in restorative dentistry and endodontics.
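
The three metrics named in this abstract (accuracy against an expert key, repeatability across repeated identical questions, and robustness across paraphrases) can be sketched roughly as below. This is not the study's evaluation code; the `accuracy` and `agreement` helpers and the toy answer lists are assumptions for illustration.

```python
def accuracy(answers, key):
    """Fraction of questions answered the same as the expert key."""
    return sum(a == k for a, k in zip(answers, key)) / len(key)

def agreement(runs):
    """Fraction of questions with an identical answer across all runs.

    runs: list of answer lists, one per run. Measures repeatability when
    the runs repeat the same questions, or robustness when each run uses
    a paraphrased version of the questions.
    """
    per_question = zip(*runs)
    return sum(len(set(answers)) == 1 for answers in per_question) / len(runs[0])

# Toy usage with 4 MCQs and 3 repeated runs of the same model.
key = ["A", "C", "B", "D"]
runs = [["A", "C", "B", "D"],
        ["A", "C", "D", "D"],
        ["A", "C", "B", "D"]]
print("accuracy of run 1:", accuracy(runs[0], key))  # 1.0
print("repeatability:", agreement(runs))             # 0.75 (question 3 varies)
```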

Language: English

Citations

0

Evaluating Large Language Models in Cardiovascular Antithrombotic Care: Performance, Accuracy, and Implications for Clinical Practice DOI
Pavel Antiperovitch,

I. Liu,

Ahmed T. Mokhtar

et al.

Canadian Journal of Cardiology, Journal Year: 2025, Volume and Issue: unknown

Published: April 1, 2025

Language: English

Citations

0

Enhancing Pulmonary Disease Prediction Using Large Language Models with Feature Summarization and Hybrid Retrieval-Augmented Generation: Multicenter Methodological Study based on Radiology Report (Preprint) DOI Creative Commons

Ruiteng Li,

Shuai Mao, Congmin Zhu

et al.

Journal of Medical Internet Research, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 13, 2025

Language: English

Citations

0

Medical accuracy of artificial intelligence chatbots in oncology: a scoping review DOI Creative Commons
David Chen,

Kate Elizabeth Avison,

Saif Addeen Alnassar

et al.

The Oncologist, Journal Year: 2025, Volume and Issue: 30(4)

Published: March 29, 2025

Abstract: Background: Recent advances in large language models (LLMs) have enabled human-like qualities of natural language competency. Applied to oncology, LLMs have been proposed to serve as an information resource and to interpret vast amounts of data as a clinical decision-support tool to improve outcomes. Objective: This review aims to describe the current status of the medical accuracy of oncology-related LLM applications and research trends for further areas of investigation. Methods: A scoping literature search was conducted on Ovid Medline for peer-reviewed studies published since 2000. We included primary studies that evaluated an LLM applied in oncology settings. Study characteristics and outcomes were extracted to describe the landscape of LLMs. Results: Sixty studies were included based on the inclusion and exclusion criteria. The majority evaluated health question-answer style examinations (48%), followed by diagnosis (20%) and management (17%). The number of studies evaluating the utility of fine-tuning and prompt-engineering increased over time from 2022 to 2024. Studies reported advantages such as use as an accurate information resource, reduction of clinician workload, and improved accessibility and readability of information, while noting disadvantages such as poor reliability, hallucinations, and the need for clinician oversight. Discussion: There exists significant interest in the application of LLMs in oncology, with a particular focus on use as a clinical decision support tool. However, further research is needed to validate these tools on external hold-out datasets for generalizability across diverse scenarios, underscoring the need for clinician supervision of these tools.

Language: English

Citations

0

Advancements in large language model accuracy for answering physical medicine and rehabilitation board review questions DOI

Jason Bitterman,

Alexander S. D’Angelo,

Alexandra Holachek

et al.

PM&R, Journal Year: 2025, Volume and Issue: unknown

Published: May 2, 2025

Abstract: Background: There have been significant advances in machine learning and artificial intelligence technology over the past few years, leading to the release of large language models (LLMs) such as ChatGPT. There are many potential applications for LLMs in health care, but it is critical to first determine how accurate they are before putting them into practice. No studies have evaluated the accuracy and precision of LLMs in responding to questions related to the field of physical medicine and rehabilitation (PM&R). Objective: To evaluate two OpenAI LLMs (GPT-3.5, released November 2022, and GPT-4o, released May 2024) in answering PM&R knowledge questions. Design: Cross-sectional study. Both models were tested on the same 744 knowledge questions covering all aspects of PM&R (general rehabilitation, stroke, traumatic brain injury, spinal cord injury, musculoskeletal medicine, pain medicine, electrodiagnostic medicine, pediatric rehabilitation, prosthetics and orthotics, rheumatology, and pharmacology). Each LLM was run three times on the question set to assess precision. Setting: N/A. Patients: N/A. Interventions: N/A. Main Outcome Measure: Percentage of correctly answered questions. Results: For the three runs of the 744-question set, GPT-3.5 answered 56.3%, 56.5%, and 56.9% of questions correctly. GPT-4o answered 83.6%, 84%, and 84.1% correctly and outperformed GPT-3.5 across subcategories. Conclusions: LLMs are rapidly advancing, with the more recent model performing much better compared to GPT-3.5. LLMs show promise for augmenting clinical practice, medical training, and patient education. However, the technology has limitations, and physicians should remain cautious about using LLMs in practice at this time.
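
The "precision" in this abstract refers to consistency across the three runs. A minimal way to summarize it from the per-run percentages reported above is sketched below; the only assumption is the choice of the range across runs as the spread measure.

```python
# Per-run accuracies (%) reported in the abstract above.
runs = {
    "GPT-3.5": [56.3, 56.5, 56.9],
    "GPT-4o":  [83.6, 84.0, 84.1],
}

for model, accs in runs.items():
    mean = sum(accs) / len(accs)
    spread = max(accs) - min(accs)   # range across runs as a simple precision measure
    print(f"{model}: mean {mean:.1f}%, range {spread:.1f} points")
# GPT-3.5: mean 56.6%, range 0.6 points
# GPT-4o:  mean 83.9%, range 0.5 points
```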

Language: English

Citations

0

Causality Extraction from Medical Text Using Large Language Models (LLMs) DOI Creative Commons
Seethalakshmi Gopalakrishnan, Luciana Garbayo, Wlodek Zadrozny

et al.

Information, Journal Year: 2024, Volume and Issue: 16(1), P. 13 - 13

Published: Dec. 30, 2024

This study explores the potential of natural language models, including large language models, to extract causal relations from medical texts, specifically clinical practice guidelines (CPGs). The outcomes of causality extraction for gestational diabetes guidelines are presented, marking a first in the field. Results are reported on a set of experiments using variants of BERT (BioBERT, DistilBERT, and BERT) and newer large language models (LLMs), namely GPT-4 and LLAMA2. Our results show that BioBERT performed better than the other models, with an average F1-score of 0.72. LLAMA2 showed similar performance but less consistency. The code and an annotated corpus of causal statements within the guidelines are released. Extracting causal structures might help identify LLMs' hallucinations and possibly prevent some errors if LLMs are used in patient settings. Some practical extensions of extracting causal relations from medical text would include providing additional diagnostic support based on frequent cause–effect relationships, identifying possible inconsistencies in guidelines, and evaluating the evidence behind recommendations.
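
The F1-score cited above is the standard harmonic mean of precision and recall over extracted relations. The sketch below computes it for toy sets of (cause, effect) pairs; the example relations are invented for illustration and are not taken from the released corpus or code.

```python
def f1_score(predicted, gold):
    """F1 over sets of (cause, effect) pairs extracted from a guideline sentence."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)        # correctly extracted relations
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy example: hypothetical gold relations for a gestational-diabetes guideline sentence.
gold = [("maternal hyperglycemia", "fetal macrosomia"),
        ("gestational diabetes", "neonatal hypoglycemia")]
pred = [("maternal hyperglycemia", "fetal macrosomia"),
        ("gestational diabetes", "preterm birth")]
print(f"F1 = {f1_score(pred, gold):.2f}")  # 0.50: one correct, one spurious, one missed
```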

Language: English

Citations

1

The Most Disruptive Near-Term Use of AI in Cancer Care: Patient Empowerment Through Software Agents DOI
Frank Austin Nothaft,

Brad Power

AI in Precision Oncology, Journal Year: 2024, Volume and Issue: unknown

Published: Aug. 30, 2024

Language: English

Citations

0
