Published: Dec. 30, 2024
Language: English
medRxiv (Cold Spring Harbor Laboratory), Journal year: 2024, Issue: unknown
Published: Sep. 3, 2024
Abstract Background Large Language Models (LLMs) show promise in medical diagnosis, but their performance varies with prompting. Recent studies suggest that modifying prompts may enhance diagnostic capabilities. Objective This study aimed to test whether a prompting approach that aligns with general clinical reasoning methodology—specifically, separating the processes of summarizing clinical information and making diagnoses based on that summary, instead of one-step processing—can improve an LLM's diagnostic capabilities. Methods 322 quiz questions from Radiology's Diagnosis Please cases (1998-2023) were used. We employed Claude 3.5 Sonnet, a state-of-the-art LLM, to compare three approaches: 1) a conventional zero-shot chain-of-thought prompt as the baseline, 2) a two-step approach in which the LLM first organizes the patient history and imaging findings and then provides diagnoses, and 3) a summary-only approach using only the LLM-generated summary for diagnoses. Results The two-step approach significantly outperformed both the baseline and summary-only methods in diagnosis accuracy, as determined by McNemar tests. Primary diagnosis accuracy was 60.6% for the two-step approach, compared with 56.5% for the baseline (p=0.042) and 56.3% for the summary-only method (p=0.035). For top differential diagnoses, accuracies were 70.5%, 66.5%, and 65.5%, respectively (p=0.005 vs. the baseline and p=0.008 vs. summary-only). No significant differences were observed between the baseline and summary-only approaches. Conclusion Our results indicate that structured prompting, separating summarization from diagnosis, enhances diagnostic accuracy. The summarization step shows potential as a valuable tool for deriving structured information from free-text clinical data, and it mirrors well-established clinical reasoning processes, suggesting its applicability to real-world settings.
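For readers who want to see how such a two-step prompt chain differs from a one-step baseline, below is a minimal Python sketch. The `call_llm` helper and the prompt wording are hypothetical placeholders (the study used Claude 3.5 Sonnet through its own API, not reproduced here); only the McNemar comparison via statsmodels corresponds to the statistical test named in the abstract.
```python
# Minimal sketch of a two-step (summarize, then diagnose) prompting flow.
# `call_llm` is a hypothetical stand-in for any chat-completion client;
# prompts are illustrative, not the authors' exact wording.
from statsmodels.stats.contingency_tables import mcnemar

def call_llm(prompt: str) -> str:
    """Placeholder for a chat-completion call (e.g., an LLM vendor SDK)."""
    raise NotImplementedError

def one_step_diagnosis(case_text: str) -> str:
    # Baseline: zero-shot chain-of-thought prompt on the raw case description.
    return call_llm(
        "Read the following radiology case and give the most likely diagnosis. "
        "Think step by step.\n\n" + case_text
    )

def two_step_diagnosis(case_text: str) -> str:
    # Step 1: organize patient history and imaging findings into a summary.
    summary = call_llm(
        "Summarize the patient history and imaging findings of this case:\n\n" + case_text
    )
    # Step 2: diagnose based on the case plus the structured summary.
    return call_llm(
        "Case:\n" + case_text + "\n\nStructured summary:\n" + summary +
        "\n\nBased on the above, give the most likely diagnosis. Think step by step."
    )

def paired_accuracy_pvalue(correct_a: list[bool], correct_b: list[bool]) -> float:
    # McNemar's exact test on per-case correctness; only discordant pairs matter.
    b = sum(a and not c for a, c in zip(correct_a, correct_b))
    c = sum(not a and c for a, c in zip(correct_a, correct_b))
    return mcnemar([[0, b], [c, 0]], exact=True).pvalue
```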
Language: English
Cited by: 0
Journal of the Korean Society of Radiology, Journal year: 2024, Issue: 85(5), P. 861 - 861
Published: Jan. 1, 2024
Large language models (LLMs) have revolutionized the global landscape of technology beyond the field of natural language processing. Owing to their extensive pre-training on vast datasets, contemporary LLMs can handle tasks ranging from general functionalities to domain-specific areas, such as radiology, without the need for additional fine-tuning. Importantly, LLMs are on a trajectory of rapid evolution, addressing challenges such as hallucination, bias in training data, high costs, performance drift, and privacy issues, along with the inclusion of multimodal inputs. The concept of small, on-premise, open-source LLMs has garnered growing interest, as fine-tuning with medical domain knowledge, efficiency, and management of performance drift can be effectively and simultaneously achieved. This review provides conceptual and actionable guidance, along with an overview of current technology and future directions for radiologists.
Language: English
Cited by: 0
Neuro-Oncology Advances, Journal year: 2024, Issue: unknown
Published: Dec. 23, 2024
Abstract Background This study aimed to explore the potential of the Advanced Data Analytics (ADA) package of GPT-4 to autonomously develop machine-learning models (MLMs) for predicting glioma molecular types using radiomics from MRI. Methods Radiomic features were extracted from preoperative MRI of n=615 newly diagnosed glioma patients to predict molecular types (IDH-wildtype vs. IDH-mutant 1p19q-codeleted vs. IDH-mutant 1p19q-non-codeleted) with a multiclass ML approach. Specifically, ADA was used to autonomously develop an ML pipeline and benchmark its performance against an established handcrafted model using various normalization methods (N4, Z-score, WhiteStripe). External validation was performed on two public datasets, D2 (n=160) and D3 (n=410). Results ADA achieved its highest accuracy of 0.820 (95%CI=0.819-0.821) on dataset D3 with N4/WS normalization, significantly outperforming the handcrafted model's 0.678 (95%CI=0.677-0.680) (p<0.001). Class-wise analysis showed variations across the different molecular types. In the IDH-wildtype group, ADA had a recall of 0.997 (95%CI=0.997-0.997), surpassing the benchmark's 0.742 (95%CI=0.740-0.743). For the IDH-mutant 1p/19q-non-codeleted group, GPT-4's recall was 0.275 (95%CI=0.272-0.279), lower than the benchmark's 0.426 (95%CI=0.423-0.430). For the IDH-mutant 1p/19q-codeleted group, recall was 0.199 (95%CI=0.191-0.206), below the benchmark's 0.730 (95%CI=0.721-0.738). On dataset D2, ADA performed significantly (p<0.001) worse than the benchmark, achieving 0.668 (95%CI=0.666-0.671) compared with 0.719 (95%CI=0.717-0.722); class-wise analysis revealed the same pattern as observed in D3. Conclusions GPT-4 ADA can autonomously develop radiomics-based MLMs with overall accuracy comparable to handcrafted MLMs. However, its poorer class-wise performance due to unbalanced data shows limitations in handling complete end-to-end pipelines.
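To make the comparison concrete, below is a short, illustrative Python sketch of a handcrafted-style radiomics baseline of the kind the abstract benchmarks against: z-score feature scaling, a multiclass classifier, and class-wise recall on an external validation cohort. The feature extraction, N4/WhiteStripe normalization, and the study's actual model choices are not reproduced; the function and class names are assumptions for illustration only.
```python
# Sketch of a multiclass radiomics classifier with class-wise recall,
# mirroring the style of evaluation described in the abstract.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score

CLASSES = ["IDH-wildtype", "IDH-mut_1p19q-codel", "IDH-mut_1p19q-non-codel"]

def train_and_validate(X_train, y_train, X_ext, y_ext):
    """Fit on a development cohort and report overall accuracy plus
    per-class recall on an external cohort (e.g., a public dataset)."""
    model = make_pipeline(
        StandardScaler(),  # z-score feature normalization
        RandomForestClassifier(n_estimators=500, random_state=0),
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_ext)
    acc = accuracy_score(y_ext, y_pred)
    # average=None yields one recall value per class, matching the
    # class-wise analysis reported above.
    recalls = recall_score(y_ext, y_pred, labels=CLASSES, average=None)
    return acc, dict(zip(CLASSES, recalls))
```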
Language: English
Cited by: 0
Published: Dec. 30, 2024
Language: English
Cited by: 0