Systematic review and meta-analysis of preclinical studies
Benjamin V. Ineichen, Ulrike Held, Georgia Salanti

et al.

Nature Reviews Methods Primers, Journal Year: 2024, Volume and Issue: 4(1)

Published: Oct. 3, 2024

Language: Английский

Citations: 2

EvidenceTriangulator: A Large Language Model Approach to Synthesizing Causal Evidence across Study Designs
Xuanyu Shi, Wenjing Zhao, Ting Chen

et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: March 19, 2024

Abstract Health strategies increasingly emphasize both behavioral and biomedical interventions, yet the complex and often contradictory guidance on diet, behavior, and health outcomes complicates evidence-based decision-making. Evidence triangulation across diverse study designs is essential for establishing causality, but scalable, automated methods for achieving this are lacking. In this study, we assess the performance of large language models (LLMs) in extracting ontological and methodological information from the scientific literature to automate evidence triangulation. A two-step extraction approach, focusing on cause-effect concepts first, followed by relation extraction, outperformed a one-step method, particularly in identifying effect direction and statistical significance. Using salt intake and blood pressure as a case study, we calculated the Convergency of Evidence (CoE) and Level of Evidence (LoE), finding a trending excitatory effect of salt intake on hypertension risk, with a moderate LoE. This approach complements traditional meta-analyses by integrating evidence across study designs, thereby facilitating more comprehensive assessments for public health recommendations.

Language: English

Citations: 1
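The two-step extraction approach described in the abstract above lends itself to a short illustration. The sketch below is a hypothetical reconstruction, not the authors' code: the model name (gpt-4o-mini), the prompt wording, and the JSON schema are all assumptions, and a production pipeline would validate the model's JSON output rather than parse it blindly.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def chat(prompt: str) -> str:
    # Single deterministic chat call; the model choice is an assumption.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content

def extract_concepts(abstract: str) -> dict:
    # Step 1: identify the exposure (cause) and outcome (effect) concepts.
    prompt = (
        "From the abstract below, extract the exposure and outcome concepts "
        'as JSON: {"exposure": "...", "outcome": "..."}.\n\n' + abstract
    )
    return json.loads(chat(prompt))  # a real pipeline would validate this

def extract_relation(abstract: str, concepts: dict) -> dict:
    # Step 2: with the concepts fixed, classify effect direction,
    # statistical significance, and study design.
    prompt = (
        f"Abstract:\n{abstract}\n\n"
        f"For exposure '{concepts['exposure']}' and outcome '{concepts['outcome']}', "
        'return JSON: {"direction": "excitatory, inhibitory, or null", '
        '"significant": "true or false", '
        '"design": "RCT, cohort, case-control, or other"}.'
    )
    return json.loads(chat(prompt))
```

Splitting concept identification from relation classification mirrors the abstract's finding that the two-step prompt outperforms asking for everything at once: the second prompt can condition on already-fixed concept spans.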

Loon Lens 1.0 Validation: Agentic AI for Title and Abstract Screening in Systematic Literature Reviews
Ghayath Janoudi, Mara Uzun, Mia Jurdana

et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Sept. 6, 2024

Abstract Introduction: Systematic literature reviews (SLRs) are critical for informing clinical research and practice, but they are time-consuming and resource-intensive, particularly during title and abstract (TiAb) screening. Loon Lens, an autonomous, agentic AI platform, streamlines TiAb screening without the need for human reviewers to conduct any screening. Methods: This study validates Loon Lens against human reviewer decisions across eight SLRs conducted by Canada’s Drug Agency, covering a range of drugs and eligibility criteria. A total of 3,796 citations were retrieved, with human reviewers identifying 287 (7.6%) for inclusion. Loon Lens autonomously screened the same citations based on the provided inclusion and exclusion criteria. Metrics such as accuracy, recall, precision, F1 score, specificity, and negative predictive value (NPV) were calculated, and bootstrapping was applied to compute 95% confidence intervals. Results: Loon Lens achieved an accuracy of 95.5% (95% CI: 94.8–96.1), recall of 98.95% (95% CI: 97.57–100%), and specificity of 95.24% (95% CI: 94.54–95.89%). Precision was lower at 62.97% (95% CI: 58.39–67.27%), suggesting that Loon Lens included more citations for full-text screening than the human reviewers did. The F1 score was 0.770 (95% CI: 0.734–0.802), indicating a strong balance between precision and recall. Conclusion: Loon Lens demonstrates substantial potential for reducing the time and cost associated with manual or semi-autonomous TiAb screening in SLRs. While further improvements are needed, the platform offers a scalable, autonomous solution for systematic reviews. Access is available upon request at https://loonlens.com/ .

Language: English

Citations: 1
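The Loon Lens abstract above summarizes paired human/AI screening decisions with bootstrapped 95% confidence intervals. Below is a minimal sketch of that computation using placeholder decision vectors rather than the study's 3,796 citations; only the bootstrap-percentile recipe is the point.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder data: simulated decisions, not the study's citations.
y_true = (rng.random(3796) < 0.076).astype(int)  # human inclusions (~7.6%)
y_pred = y_true.copy()
flip = rng.random(3796) < 0.05                   # simulate some AI disagreement
y_pred[flip] = 1 - y_pred[flip]

def metrics(y_true, y_pred):
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": recall,
        "precision": precision,
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
        "f1": 2 * precision * recall / (precision + recall),
    }

def bootstrap_ci(y_true, y_pred, key, n_boot=2000):
    # Resample citation pairs with replacement; take 2.5th/97.5th percentiles.
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), size=len(y_true))
        stats.append(metrics(y_true[idx], y_pred[idx])[key])
    return np.percentile(stats, [2.5, 97.5])

print(metrics(y_true, y_pred)["recall"], bootstrap_ci(y_true, y_pred, "recall"))
```

The high-recall, lower-precision pattern reported in the paper is the desirable failure mode for screening: the platform errs toward sending borderline citations on to full-text review rather than discarding them.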

The Heap, the Hype, the Reality: Generative Pretrained Transformer for Systematic Reviews
Tianjing Li, Stephanie Chang

Annals of Internal Medicine, Journal Year: 2024, Volume and Issue: 177(6), P. 828 - 829

Published: May 20, 2024

Language: English

Citations: 0

French-Style Applications of Artificial Intelligence to Human Health in a European Context
Dominique Desbois

IFIP Advances in Information and Communication Technology, Journal Year: 2024, Volume and Issue: unknown, P. 110 - 122

Published: Jan. 1, 2024

Language: English

Citations: 0

LLMscreen: A Python Package for Systematic Review Screening of Scientific Texts Using Prompt Engineering
Ziqian Xia, Jinquan Ye, Bo Hu

et al.

Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown

Published: Sept. 11, 2024

Abstract Systematic reviews represent a cornerstone of evidence-based research, yet the process is labor-intensive and time-consuming, often requiring substantial human resources. The advent of Large Language Models (LLMs) offers a novel approach to streamlining systematic reviews, particularly in the title and abstract screening phase. This study introduces a new Python package built on LLMs to accelerate this process, evaluating its performance across three datasets using distinct prompt strategies: single-prompt, k-value setting, and zero-shot. The k-value setting emerged as the most effective, achieving a precision of 0.649 and reducing the average error rate to 0.4%, significantly lower than the 10.76% typically observed among human reviewers. Moreover, it enabled the screening of 3,000 papers in under 8 minutes at a cost of only $0.30, an over 250-fold improvement in time and a 2,000-fold improvement in cost efficiency compared with traditional methods. These findings underscore the potential of LLMs to enhance screening accuracy, though further research is needed to address challenges related to dataset variability and model transparency. Expanding the application to other stages of systematic reviews, such as data extraction and synthesis, could streamline the review process, making it more comprehensive and less burdensome for researchers.

Language: English

Citations: 0
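The LLMscreen abstract above does not show the package's actual API, so the snippet below is only a generic zero-shot TiAb screening loop in the same spirit; the criteria string, prompt wording, and model name are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# Placeholder criteria; a real review would paste its full eligibility criteria.
CRITERIA = "Include randomized controlled trials of drug therapies in adults."

def screen(title: str, abstract: str) -> bool:
    # Zero-shot strategy: one prompt per citation, binary INCLUDE/EXCLUDE answer.
    prompt = (
        f"Inclusion criteria: {CRITERIA}\n\n"
        f"Title: {title}\nAbstract: {abstract}\n\n"
        "Answer with exactly one word: INCLUDE or EXCLUDE."
    )
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return reply.choices[0].message.content.strip().upper().startswith("INCLUDE")
```

The reported throughput (3,000 papers in under 8 minutes, about 375 papers per minute) implies heavily concurrent requests rather than a sequential loop like this one, and the $0.30 total works out to roughly $0.0001 per abstract, consistent with a small, inexpensive model.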

From promise to practice: challenges and pitfalls in the evaluation of large language models for data extraction in evidence synthesis
Gerald Gartlehner, Leila C. Kahwati, Barbara Nussbaumer‐Streit

et al.

BMJ Evidence-Based Medicine, Journal Year: 2024, Volume and Issue: unknown, P. bmjebm-113199

Published: Dec. 20, 2024

Language: English

Citations: 0