
European Radiology, Journal Year: 2024, Volume and Issue: 34(12), P. 7728 - 7730
Published: July 9, 2024
Language: English
Radiology, Journal Year: 2025, Volume and Issue: 314(1)
Published: Jan. 1, 2025
Textual descriptions of radiologic image findings play a critical role in GPT-4 with vision-based differential diagnosis, underlining the importance of radiologist expertise even in the era of multimodal large language models.
Language: English
Citations: 1

European Radiology, Journal Year: 2025, Volume and Issue: unknown
Published: March 7, 2025
This study investigated the impact of human-large language model (LLM) collaboration on the accuracy and efficiency of brain MRI differential diagnosis. In this retrospective study, forty brain MRI cases with a challenging but definitive diagnosis were randomized into two groups of twenty cases each. Six radiology residents with an average of 6.3 months of experience in reading brain MRI exams evaluated one set supported by conventional internet search (conventional workflow) and the other utilizing an LLM-based search engine and hybrid chatbot (LLM-assisted workflow). A cross-over design ensured that each case was examined in both workflows with equal frequency. For each case, readers were instructed to determine the three most likely diagnoses. LLM responses were analyzed by a panel of radiologists. Benefits and challenges of human-LLM interaction were derived from observations and participant feedback. The LLM-assisted workflow yielded a superior rate of correct diagnoses (70/114; 61.4% (LLM-assisted) vs 53/114; 46.5% (conventional), p = 0.033, chi-square test). No difference in interpretation time or level of confidence was observed. An analysis revealed that correct LLM suggestions translated into correct reader diagnoses in 82.1% (60/73) of instances. Inaccurate image descriptions (9.2% of cases), hallucinations (11.5% of cases), and insufficient contextualization were identified as challenges related to human-LLM interaction. Human-LLM collaboration has the potential to improve brain MRI differential diagnosis, yet several challenges must be addressed to ensure effective adoption and user acceptance. Question: While large language models have shown potential to support radiological diagnosis, their role in an interactive, human-in-the-loop context remains underexplored. Findings: LLM-assisted reading yielded more correct diagnoses than conventional internet search; inaccurate image descriptions, hallucinations, and insufficient contextualization remain challenges. Clinical relevance: Our results highlight the potential of an LLM-assisted workflow to increase diagnostic accuracy and underline the necessity of collaborative efforts between humans and LLMs rather than using LLMs in isolation.
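As an illustration (not part of the cited study), the reported chi-square comparison can be reproduced from the stated counts. A minimal Python sketch, assuming a standard 2x2 test with Yates' continuity correction (scipy's default):

    from scipy.stats import chi2_contingency

    # 2x2 table built from the reported counts:
    # rows = workflow, columns = (correct, incorrect) diagnoses
    table = [[70, 114 - 70],   # LLM-assisted
             [53, 114 - 53]]   # conventional internet search
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # p should land near the reported 0.033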
Language: English
Citations: 1

npj Digital Medicine, Journal Year: 2025, Volume and Issue: 8(1)
Published: Feb. 12, 2025
Abstract Recent advancements in large language models (LLMs) have created new ways to support radiological diagnostics. While proprietary LLMs typically rely on cloud deployment, open-source LLMs can address privacy concerns through local deployment and provide advantages in continuity of access and potentially lower costs. This study evaluated the diagnostic performance of fifteen open-source LLMs and one closed-source LLM (GPT-4o) on 1,933 cases from the Eurorad library. The models provided differential diagnoses based on clinical history and imaging findings. Responses were considered correct if the true diagnosis appeared among the top three suggestions. The models were further tested on 60 non-public brain MRI cases from a tertiary care hospital to assess generalizability. In both datasets, GPT-4o demonstrated superior performance, closely followed by Llama-3-70B, revealing how open-source LLMs are rapidly closing the gap to closed-source models. Our findings highlight the potential of LLMs as diagnostic decision support tools for challenging, real-world cases.
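The top-three scoring rule described above can be expressed compactly. The sketch below is illustrative only and uses exact string matching, whereas the study relied on expert review of free-text answers; the function names are hypothetical.

    def top3_correct(reference: str, suggestions: list[str]) -> bool:
        """True if the reference diagnosis appears among the first three suggestions."""
        ref = reference.strip().lower()
        return any(ref == s.strip().lower() for s in suggestions[:3])

    def top3_accuracy(cases: list[tuple[str, list[str]]]) -> float:
        """Fraction of cases scored correct under the top-three rule."""
        return sum(top3_correct(ref, sugg) for ref, sugg in cases) / len(cases)

    # Hypothetical example:
    # top3_accuracy([("glioblastoma", ["CNS lymphoma", "glioblastoma", "metastasis"])])  # -> 1.0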
Language: English
Citations: 0

Digital Health, Journal Year: 2025, Volume and Issue: 11
Published: Feb. 1, 2025
Background: This study evaluates the performance of GPT-4o in detecting errors in ACR TIRADS thyroid ultrasound reports and its potential to reduce report generation time. Methods: A retrospective analysis of 200 thyroid ultrasound reports from the Second Affiliated Hospital of Fujian Medical University was conducted, with reports categorized as correct or containing up to three errors. GPT-4o's performance was compared with that of physicians of varying experience levels in terms of error detection and processing time. Results: GPT-4o detected 90.0% (180/200) of errors, slightly less than the best-performing senior physician's 93.0% (186/200), with no significant difference (p = 0.281). Its error detection rate was comparable to that of the physicians overall (p = 0.098 to 0.866). It outperformed Resident 2 in diagnostic accuracy (87% vs. 69%). Reader agreement was low (Cohen's kappa 0 to 0.31). GPT-4o reviewed reports significantly faster than all physicians (0.79 vs 1.8 and 3.1 h, p < 0.001), making it a reliable and efficient tool for medical imaging. Conclusions: GPT-4o is comparable to experienced physicians in error detection and improves efficiency, offering a valuable tool for enhancing report accuracy and aiding junior residents.
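Cohen's kappa, used above to quantify reader agreement, can be computed as in the minimal sketch below; the labels are hypothetical and serve only to show the calculation, not the study's data.

    from sklearn.metrics import cohen_kappa_score

    # 1 = report flagged as containing an error, 0 = report judged correct
    reader_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
    reader_b = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
    print(f"Cohen's kappa: {cohen_kappa_score(reader_a, reader_b):.2f}")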
Language: English
Citations: 0

European Radiology, Journal Year: 2025, Volume and Issue: unknown
Published: March 15, 2025
Abstract Objectives: To compare the impact of on-table monitoring vs standard-of-care multiparametric MRI (mpMRI) on the utilisation of gadolinium contrast in prostate MRI. Materials and methods: This retrospective observational study of prospectively acquired data was conducted at a single institution over an 18-month period. A cohort of patients undergoing MRI for suspected prostate cancer (PCa) underwent on-table monitoring, where their T2 and DWI images were reviewed by the supervising radiologist during the scan to decide whether to acquire dynamic contrast-enhanced (DCE) sequences. Scans were reported using PI-RADS v2.1 and followed up with biopsy or for at least 12 months. The rate of contrast administration, biopsy rates, and diagnostic accuracy were compared with those of a control group undergoing standard-of-care mpMRI in the same period, using propensity score matching. Estimates of cost savings were also calculated. Results: 1410 patients were identified; after matching, 598 were analysed, 178 of whom underwent on-table monitoring. 75.8% (135/178) did not receive gadolinium. Contrast was used mainly for indeterminate lesions (27/43) and significant artefacts on bpMRI (14/43). When comparing the monitored and non-monitored groups, there were comparable numbers of biopsies performed (52.2% vs 49.5%, p = 0.54), PI-RADS 3/5 scoring rates (10.1% vs 7.4%, p = 0.27), sensitivity (98.3% vs 99.2%, p = 0.56), and specificity (63.9% vs 70.7%, p = 0.18) for the detection of clinically significant PCa. When DCE was acquired, it was deemed helpful in 67.4% (29/43) of cases and improved both PI-QUAL v2 and reader confidence scores. There was an estimated cost saving of £56,677 over the study. Conclusion: On-table monitoring significantly reduced the need for gadolinium contrast without compromising biopsy or cancer detection rates. Key Points: Question: Default use of contrast in prostate MRI is not always of clinical benefit and has associated side effects and healthcare costs. Findings: On-table monitoring avoided contrast in 75.8% of patients, reducing costs whilst maintaining clinically significant cancer detection and improving reader confidence. Clinical relevance: On-table monitoring offers personalised patient protocolling and a reduction in contrast use and its costs, potentially maximising the advantages of biparametric MRI.
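The propensity score matching mentioned above pairs monitored and control patients with similar baseline characteristics. The sketch below is a generic 1:1 nearest-neighbour implementation under assumed covariate names (age, psa, prostate_volume), not the study's actual pipeline.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    def propensity_match(df: pd.DataFrame, treatment: str, covariates: list[str],
                         caliper: float = 0.05) -> pd.DataFrame:
        """1:1 nearest-neighbour matching on the estimated propensity score."""
        model = LogisticRegression(max_iter=1000).fit(df[covariates], df[treatment])
        df = df.assign(ps=model.predict_proba(df[covariates])[:, 1])
        treated = df[df[treatment] == 1]
        control = df[df[treatment] == 0].copy()
        matched = []
        for _, row in treated.iterrows():
            diffs = (control["ps"] - row["ps"]).abs()
            if diffs.empty or diffs.min() > caliper:
                continue  # no acceptable control within the caliper
            best = diffs.idxmin()
            matched.extend([row, control.loc[best]])
            control = control.drop(index=best)  # match without replacement
        return pd.DataFrame(matched)

    # Hypothetical usage:
    # matched = propensity_match(cohort, "monitored", ["age", "psa", "prostate_volume"])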
Language: English
Citations: 0

Experimental Hematology and Oncology, Journal Year: 2024, Volume and Issue: 13(1)
Published: July 27, 2024
Abstract The generation of radiological results from image data represents a pivotal aspect of medical image analysis. The latest iteration, ChatGPT-4, a large multimodal model that integrates both text and image inputs, including dermatoscopy, histology, and X-ray images, has attracted considerable attention in the field of radiology. To further investigate the performance of ChatGPT-4 in image recognition, we examined its ability to recognize credible osteosarcoma images. We demonstrated that ChatGPT-4 can more accurately diagnose bone with or without significant space-occupying lesions but has limited ability to differentiate malignant lesions from adjacent normal tissue. Thus far, the current capabilities of ChatGPT-4 are insufficient to make a reliable imaging diagnosis of osteosarcoma. Therefore, users should be aware of the limitations of this technology.
Language: English
Citations: 2

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: June 25, 2024
ABSTRACT Purpose: This study evaluated the diagnostic accuracy and differential diagnosis capabilities of 12 large language models (LLMs), one cardiac radiologist, and three general radiologists in cardiac radiology. The impact of ChatGPT-4o assistance on radiologist performance was also investigated. Materials and Methods: We collected 80 publicly available "Cardiac Case of the Month" cases from the Society of Thoracic Radiology website. The LLMs and General Radiologist-III were provided with text-based case information, whereas the other radiologists visually assessed the cases without assistance. Diagnostic accuracy and differential diagnosis scores (DDx Score) were analyzed using chi-square, Kruskal-Wallis, Wilcoxon, McNemar, and Mann-Whitney U tests. Results: The unassisted cardiac radiologist achieved a diagnostic accuracy of 72.5%, General Radiologist-I 53.8%, and General Radiologist-II 51.3%. With ChatGPT-4o assistance, accuracy improved to 78.8%, 70.0%, and 63.8%, respectively. The improvements for General Radiologists-I and II were statistically significant (P ≤ 0.006). All radiologists' DDx Scores improved significantly (P ≤ 0.05). Remarkably, General Radiologist-I's GPT-4o-assisted DDx Score was not significantly different from the unassisted Cardiac Radiologist's (P > 0.05). Among the LLMs, Claude 3.5 Sonnet and Claude 3 Opus had the highest diagnostic accuracy (81.3%), followed by the next-best model (70.0%). Regarding the DDx Score, Claude 3.5 Sonnet outperformed all other models (P < 0.05). General Radiologist-III's accuracy improved from 48.8% to 63.8% with GPT-4o assistance (P < 0.001). Conclusion: LLM assistance may enhance radiologist performance in cardiac imaging, suggesting its potential as a valuable diagnostic support tool. Further research is required to assess clinical integration.
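The McNemar test referenced above compares a reader's paired (unassisted vs assisted) case-level results. The discordant-pair counts in this sketch are hypothetical, chosen only to be consistent with the reported aggregate accuracies (roughly 39/80 unassisted and 51/80 assisted for General Radiologist-III).

    from statsmodels.stats.contingency_tables import mcnemar

    # Paired 2x2 table (hypothetical counts):
    #                      assisted correct   assisted wrong
    # unassisted correct          39                 0
    # unassisted wrong            12                29
    table = [[39, 0],
             [12, 29]]
    result = mcnemar(table, exact=True)  # exact binomial test on the discordant pairs
    print(f"statistic = {result.statistic}, p = {result.pvalue:.4f}")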
Language: English
Citations: 1

European Radiology, Journal Year: 2024, Volume and Issue: unknown
Published: Oct. 18, 2024
Abstract Objectives: ChatGPT-4 Vision (GPT-4V) is a state-of-the-art multimodal large language model (LLM) that may be queried using images. We aimed to evaluate the tool's diagnostic performance when autonomously assessing clinical imaging studies. Materials and methods: A total of 206 imaging studies (i.e., radiography (n = 60), CT (n = 60), MRI (n = 60), and angiography (n = 26)) with unequivocal findings and established reference diagnoses from the radiologic practice of a university hospital were accessed. Readings were performed uncontextualized, with only the image provided, and contextualized, with additional demographic and clinical information. Responses were assessed along multiple dimensions and analyzed with appropriate statistical tests. Results: With its pronounced propensity to favor context over image information, diagnostic accuracy improved from 8.3% (uncontextualized) to 29.1% (contextualized, first diagnosis correct) and 63.6% (contextualized, correct diagnosis among the differential diagnoses) (p ≤ 0.001, Cochran's Q test). Diagnostic accuracy declined by up to 30% when 20 images were re-read after 30 and 90 days and seemed unrelated to self-reported confidence (Spearman's ρ = 0.117, p = 0.776). While the described imaging findings matched the suggested diagnoses in 92.7%, indicating valid reasoning, the tool fabricated findings in 258 of 412 responses and misidentified imaging modalities or anatomic regions in 65. Conclusion: GPT-4V, in its current form, cannot reliably interpret radiologic images. Its tendency to disregard the image, fabricate findings, and misidentify details, especially without clinical context, may misguide healthcare providers and put patients at risk. Key Points: Question: Can Generative Pre-trained Transformer 4 Vision autonomously assess radiologic images, with and without clinical context? Findings: GPT-4V performed poorly, demonstrating diagnostic accuracy rates of 8% (uncontextualized), 29% (contextualized, most likely diagnosis correct), and 64% (contextualized, correct diagnosis among the differential diagnoses). Clinical relevance: The diagnostic utility of commercial multimodal large language models, such as GPT-4V, is limited. Without expert oversight, errors may compromise patient safety and decision-making. These models must be further refined to be beneficial.
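Cochran's Q tests whether correctness rates differ across the three reading conditions on the same cases. Per-case outcomes are not available here, so the sketch below simulates binary outcomes that roughly match the reported rates; it only illustrates how the test is applied, not the study's actual analysis.

    import numpy as np
    from statsmodels.stats.contingency_tables import cochrans_q

    rng = np.random.default_rng(0)
    n = 206  # number of imaging studies
    outcomes = np.column_stack([
        rng.random(n) < 0.083,  # uncontextualized, first diagnosis correct
        rng.random(n) < 0.291,  # contextualized, first diagnosis correct
        rng.random(n) < 0.636,  # contextualized, correct within the differentials
    ]).astype(int)
    result = cochrans_q(outcomes)
    print(f"Q = {result.statistic:.1f}, p = {result.pvalue:.2e}")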
Language: English
Citations: 1

Clinical Imaging, Journal Year: 2024, Volume and Issue: 117, P. 110356 - 110356
Published: Nov. 13, 2024
Language: English
Citations: 1

European Radiology, Journal Year: 2024, Volume and Issue: 34(12), P. 7728 - 7730
Published: July 9, 2024
Language: English
Citations: 0