Consensus-Based Reasoning with Locally Deployed LLMs for Structured Data Extraction from Surgical Pathology Reports
Aakash Tripathi, Asim Waqas, Ehsan Ullah, et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown

Published: April 25, 2025

Surgical pathology reports contain essential diagnostic information, in free-text form, required for cancer staging, treatment planning, and registry documentation. However, their unstructured nature and variability across tumor types and institutions pose challenges for automated data extraction. We present a consensus-driven, reasoning-based framework that uses multiple locally deployed large language models (LLMs) to extract six key variables: site, laterality, histology, stage, grade, and behavior. Each LLM produces structured outputs with accompanying justifications, which are evaluated for accuracy and coherence by a separate reasoning model. Final consensus values are determined through aggregation, and expert validation is conducted by board-certified or equivalent pathologists. The framework was applied to over 4,000 reports from The Cancer Genome Atlas (TCGA) and Moffitt Cancer Center. Expert review confirmed high agreement in the TCGA dataset for behavior (100.0%), histology (98.5%), site (95.2%), and grade (95.6%), with lower performance for stage (87.6%) and laterality (84.8%). In the Moffitt dataset (brain, breast, lung), agreement remained high across variables, with individual variables achieving strong agreement (98.3%, 92.4%). Certain challenges emerged, such as inconsistent mention of sentinel lymph node details and anatomical ambiguity in biopsy interpretations. Statistical analyses revealed significant main effects of model type, variable, and organ system, as well as interactions with the extracted variable, emphasizing the role of clinical context in performance. These results highlight the importance of stratified, multi-organ evaluation frameworks in benchmarking LLM applications. Textual justifications enhanced interpretability and enabled human reviewers to audit outputs. Overall, this consensus-based approach demonstrates that LLMs can provide a transparent, accurate, and auditable solution for integrating AI-driven extraction into real-world workflows, including abstraction and synoptic reporting.

Language: English
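
The abstract above describes aggregating per-model outputs into final consensus values but does not state the aggregation rule. The following is only a minimal sketch of one plausible rule, a score-weighted vote: the function name `consensus`, the input layout, and the 0-to-1 `score` (standing in for the reasoning model's rating of each justification) are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# The six variables named in the abstract.
VARIABLES = ("site", "laterality", "histology", "stage", "grade", "behavior")


def consensus(outputs):
    """Pick one value per variable by score-weighted voting.

    `outputs` is a list with one dict per model, mapping a variable name to a
    (value, score) pair. The score is a hypothetical 0..1 rating assigned by a
    separate reasoning model; the paper's actual scheme may differ.
    """
    result = {}
    for var in VARIABLES:
        totals = defaultdict(float)
        for out in outputs:
            if var in out:
                value, score = out[var]
                # Normalize case and whitespace so "Breast" and "breast" pool.
                totals[value.strip().lower()] += score
        if totals:
            # Highest summed score wins; ties fall back to alphabetical order
            # so the result is deterministic.
            result[var] = min(totals, key=lambda v: (-totals[v], v))
        else:
            result[var] = None  # no model reported this variable
    return result


if __name__ == "__main__":
    example = [
        {"site": ("Breast", 0.9), "laterality": ("Left", 0.8)},
        {"site": ("breast", 0.7), "laterality": ("Right", 0.6)},
        {"site": ("Lung", 0.95), "laterality": ("right", 0.5)},
    ]
    print(consensus(example))
```

Weighting votes by the evaluator's score lets a well-justified minority answer outweigh weakly justified agreement; a plain majority vote is the special case where every score is 1.0. Variables that no model reports come back as `None` so they can be routed to human review.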

A guide to artificial intelligence for cancer researchers
Raquel Pérez-López, Narmin Ghaffari Laleh, Faisal Mahmood, et al.

Nature reviews. Cancer, Journal Year: 2024, Volume and Issue: 24(6), P. 427 - 441

Published: May 16, 2024

Language: English

Citations

67

A future role for health applications of large language models depends on regulators enforcing safety standards
Oscar Freyer, Isabella C. Wiest, Jakob Nikolas Kather, et al.

The Lancet Digital Health, Journal Year: 2024, Volume and Issue: 6(9), P. e662 - e672

Published: Aug. 23, 2024

Amid the rapid integration of artificial intelligence in clinical settings, large language models (LLMs), such as Generative Pre-trained Transformer-4, have emerged as multifaceted tools that have potential for health-care delivery, diagnosis, and patient care. However, the deployment of LLMs raises substantial regulatory and safety concerns. Due to their high output variability, poor inherent explainability, and the risk of so-called AI hallucinations, LLM-based applications that serve a medical purpose face challenges for approval as medical devices under US and EU laws, including the recently passed Artificial Intelligence Act. Despite unaddressed risks for patients, including misdiagnosis and unverified advice, such applications are available on the market. The ambiguity surrounding these tools creates an urgent need for frameworks that accommodate their unique capabilities and limitations. Alongside the development of these frameworks, existing regulations should be enforced. If regulators fear enforcing the regulations in a market dominated by supply or technology companies, the consequences of layperson harm will force belated action, damaging the potential of such advice.

Language: English

Citations

27

Augmented non-hallucinating large language models as medical information curators
Stephen Gilbert, Jakob Nikolas Kather, Aidan Hogan, et al.

npj Digital Medicine, Journal Year: 2024, Volume and Issue: 7(1)

Published: April 23, 2024

Reliably processing and interlinking medical information has been recognized as a critical foundation to the digital transformation of medical workflows, and despite the development of medical ontologies, the optimization of these has been a major bottleneck to digital medicine. The advent of large language models has brought great excitement, and maybe a solution to medicine's 'communication problem' is in sight, but how can the known weaknesses of these models, such as hallucination and non-determinism, be tempered? Retrieval Augmented Generation, particularly through knowledge graphs, is an automated approach that can deliver structured reasoning and a model of truth alongside LLMs, relevant to information structuring and therefore also to decision support.

Language: English

Citations

16

Ethical Considerations in Human-Centered AI: Advancing Oncology Chatbots through Large Language Models (Preprint)
James C. L. Chow, Kay Li

JMIR Bioinformatics and Biotechnology, Journal Year: 2024, Volume and Issue: 5, P. e64406 - e64406

Published: Sept. 25, 2024

The integration of chatbots in oncology underscores the pressing need for human-centered artificial intelligence (AI) that addresses patient and family concerns with empathy and precision. Human-centered AI emphasizes ethical principles, empathy, and user-centric approaches, ensuring that technology aligns with human values and needs. This review critically examines the ethical implications of using large language models (LLMs) like GPT-3 and GPT-4 (OpenAI) in oncology chatbots. It examines how these models replicate human-like language patterns, impacting the design of ethical AI systems. The paper identifies key strategies for ethically developing oncology chatbots, focusing on potential biases arising from extensive datasets and neural networks. Specific datasets, such as those sourced predominantly from Western medical literature and patient interactions, may introduce biases by overrepresenting certain demographic groups. Moreover, the training methodologies of LLMs, including fine-tuning processes, can exacerbate these biases, leading to outputs that disproportionately favor affluent or Western populations while neglecting marginalized communities. By providing examples of biased outputs, the review highlights the ethical challenges that LLMs present and the need for mitigation strategies. The study emphasizes integrating human-centric values into AI to mitigate these biases, ultimately advocating for the development of oncology chatbots that are aligned with ethical principles and capable of serving diverse patient populations equitably.

Language: English

Citations

15

A comparative study of large language model-based zero-shot inference and task-specific supervised classification of breast cancer pathology reports
Madhumita Sushil, Travis Zack, Divneet Mandair, et al.

Journal of the American Medical Informatics Association, Journal Year: 2024, Volume and Issue: 31(10), P. 2315 - 2327

Published: June 20, 2024

Although supervised machine learning is popular for information extraction from clinical notes, creating large annotated datasets requires extensive domain expertise and is time-consuming. Meanwhile, large language models (LLMs) have demonstrated promising transfer learning capability. In this study, we explored whether recent LLMs could reduce the need for large-scale data annotations.

Language: English

Citations

14

Large language models could make natural language again the universal interface of healthcare
Jakob Nikolas Kather, Dyke Ferber, Isabella C. Wiest, et al.

Nature Medicine, Journal Year: 2024, Volume and Issue: 30(10), P. 2708 - 2710

Published: Aug. 23, 2024

Language: English

Citations

9

Enhancing doctor-patient communication using large language models for pathology report interpretation
Xiongwen Yang, Yi Xiao, Di Liu, et al.

BMC Medical Informatics and Decision Making, Journal Year: 2025, Volume and Issue: 25(1)

Published: Jan. 23, 2025

Large language models (LLMs) are increasingly utilized in healthcare settings. Postoperative pathology reports, which are essential for diagnosing and determining treatment strategies for surgical patients, frequently include complex data that can be challenging for patients to comprehend. This complexity can adversely affect the quality of communication between doctors and patients about their diagnosis and treatment options, potentially impacting patient outcomes such as understanding of their condition, treatment adherence, and overall satisfaction. This study analyzed text pathology reports from four hospitals between October and December 2023, focusing on malignant tumors. Using GPT-4, we developed templates for interpretive pathology reports (IPRs) to simplify medical terminology for non-professionals. We randomly selected 70 reports to generate these templates and evaluated the remaining 628 reports for consistency and readability. Patient understanding was measured using a custom-designed pathology report understanding level assessment scale, scored by volunteers with no medical background. The study also recorded doctor-patient communication time and patient comprehension levels before and after the use of IPRs. Among the 698 pathology reports analyzed, interpretation through LLMs significantly improved readability and patient understanding. The average doctor-patient communication time decreased by over 70%, from 35 to 10 min (P < 0.001), with the use of IPRs. The study also found higher patient understanding scores when AI-generated reports were provided, rising from 5.23 points to 7.98 points, indicating an effective translation of complex medical information. Consistency between original pathology reports (OPRs) and IPRs was also evaluated, with results showing high consistency across all assessed dimensions, achieving a score of 4.95 out of 5. This research demonstrates the efficacy of LLMs like GPT-4 in enhancing doctor-patient communication by translating pathology reports into more accessible language. While this study did not directly measure patient outcomes or satisfaction, it provides evidence that improved patient understanding and reduced communication time may positively influence patient engagement. These findings highlight the potential of AI to bridge gaps between medical professionals and the public in healthcare environments.

Language: English

Citations

1

Emerging applications of NLP and large language models in gastroenterology and hepatology: a systematic review
Mahmud Omar, Salih Nassar, Kassem Sharif, et al.

Frontiers in Medicine, Journal Year: 2025, Volume and Issue: 11

Published: Jan. 22, 2025

Background and aim: In recent years, natural language processing (NLP) has transformed significantly with the introduction of large language models (LLMs). This review updates on NLP and LLM applications and challenges in gastroenterology and hepatology. Methods: Registered with PROSPERO (CRD42024542275) and adhering to PRISMA guidelines, we searched six databases for relevant studies published from 2003 to 2024, ultimately including 57 studies. Results: Our review notes an increase in relevant publications in 2023–2024 compared to previous years, reflecting growing interest in newer models such as GPT-3 and GPT-4. The results demonstrate that NLP models have enhanced data extraction from electronic health records and other unstructured medical data sources. Key findings include high precision in identifying disease characteristics from unstructured reports and ongoing improvement in clinical decision-making. Risk of bias assessments using the ROBINS-I, QUADAS-2, and PROBAST tools confirmed the methodological robustness of the included studies. Conclusion: NLP and LLMs can enhance diagnosis and treatment in gastroenterology and hepatology. They enable data extraction from unstructured medical records, such as endoscopy reports and patient notes, enhancing clinical decision-making. Despite these advancements, integrating these tools into routine practice is still challenging. Future work should prospectively demonstrate real-world value.

Language: English

Citations

1

Applications of Large Language Models in Pathology
Jerome Cheng

Bioengineering, Journal Year: 2024, Volume and Issue: 11(4), P. 342 - 342

Published: March 31, 2024

Large language models (LLMs) are transformer-based neural networks that can provide human-like responses to questions and instructions. LLMs can generate educational material, summarize text, extract structured data from free text, create reports, write programs, and potentially assist in case sign-out. LLMs combined with vision models can assist in interpreting histopathology images. LLMs have immense potential in transforming pathology practice and education, but these models are not infallible, so any artificial intelligence generated content must be verified with reputable sources. Caution must be exercised on how these models are integrated into clinical practice, as they can produce hallucinations and incorrect results, and an over-reliance on artificial intelligence may lead to de-skilling and automation bias. This review paper provides a brief history of LLMs and highlights several use cases for LLMs in the field of pathology.

Language: English

Citations

8

Evaluating the Prevalence of Burnout Among Health Care Professionals Related to Electronic Health Record Use: Systematic Review and Meta-Analysis
Yuxuan Wu, Mingyue Wu, Changyu Wang, et al.

JMIR Medical Informatics, Journal Year: 2024, Volume and Issue: 12, P. e54811 - e54811

Published: April 17, 2024

Background: Burnout among health care professionals is a significant concern, with detrimental effects on health care service quality and patient outcomes. The use of the electronic health record (EHR) system has been identified as a significant contributor to burnout among health care professionals. Objective: This systematic review and meta-analysis aims to assess the prevalence of burnout among health care professionals associated with the use of the EHR system, thereby providing evidence to improve health information systems and develop strategies to measure and mitigate burnout. Methods: We conducted a comprehensive search of the PubMed, Embase, and Web of Science databases for English-language peer-reviewed articles published between January 1, 2009, and December 31, 2022. Two independent reviewers applied inclusion and exclusion criteria, and study quality was assessed using the Joanna Briggs Institute checklist and the Newcastle-Ottawa Scale. Meta-analyses were performed using R (version 4.1.3; R Foundation for Statistical Computing), with EndNote X7 (Clarivate) for reference management. Results: The review included 32 cross-sectional studies and 5 case-control studies with a total of 66,556 participants, mainly physicians and registered nurses. The pooled prevalence of burnout was 40.4% (95% CI 37.5%-43.2%). Case-control studies indicated a higher likelihood of burnout among health care professionals who spent more time on EHR-related tasks outside work (odds ratio 2.43, 95% CI 2.31-2.57). Conclusions: The findings highlight the association between increased use of the EHR system and burnout among health care professionals. Potential solutions include optimizing EHR systems, implementing automated dictation or note-taking, employing scribes to reduce documentation burden, and leveraging artificial intelligence to enhance EHR system efficiency and reduce the risk of burnout. Trial Registration: PROSPERO International Prospective Register of Systematic Reviews CRD42021281173; https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42021281173

Language: English

Citations

7