Can large language models be new supportive tools in coronary computed tomography angiography reporting? DOI
Eren Çamur, Turay Cesur, Yasin Celal Güneş

et al.

Clinical Imaging, Journal year: 2024, Number 114, pp. 110271 - 110271

Published: Aug. 31, 2024

Language: English

Empowering Radiologists with ChatGPT-4o: Comparative Evaluation of Large Language Models and Radiologists in Cardiac Cases DOI Creative Commons
Turay Cesur, Yasin Celal Güneş, Eren Çamur

et al.

medRxiv (Cold Spring Harbor Laboratory), Journal year: 2024, Number unknown

Published: June 25, 2024

ABSTRACT

Purpose: This study evaluated the diagnostic accuracy and differential diagnosis capabilities of 12 Large Language Models (LLMs), one cardiac radiologist, and three general radiologists in cardiac radiology. The impact of ChatGPT-4o assistance on radiologist performance was also investigated.

Materials and Methods: We collected 80 publicly available "Cardiac Case of the Month" cases from the Society of Thoracic Radiology website. The LLMs and Radiologist-III were provided with text-based information only, whereas the other radiologists visually assessed the cases, initially without assistance. Diagnostic accuracy and differential diagnosis scores (DDx Score) were analyzed using chi-square, Kruskal-Wallis, Wilcoxon, McNemar, and Mann-Whitney U tests.

Results: Unassisted diagnostic accuracy was 72.5% for the cardiac radiologist, 53.8% for General Radiologist-I, and 51.3% for General Radiologist-II. With ChatGPT-4o assistance, accuracy improved to 78.8%, 70.0%, and 63.8%, respectively. The improvements for General Radiologists I and II were statistically significant (P≤0.006), and all radiologists' DDx Scores improved significantly (P≤0.05). Remarkably, General Radiologist-I's GPT-4o-assisted DDx Score was not significantly different from the Cardiac Radiologist's (P>0.05). Among the LLMs, Claude 3.5 Sonnet and Claude 3 Opus had the highest diagnostic accuracy (81.3%), followed by the next-best model (70.0%). Regarding the DDx Score, Claude 3.5 Sonnet outperformed all other models (P<0.05). Radiologist-III's accuracy rose from 48.8% to 63.8% with GPT-4o assistance (P<0.001).

Conclusion: ChatGPT-4o assistance may enhance radiologist performance in cardiac imaging, suggesting its potential as a valuable support tool. Further research is required to assess clinical integration.
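
To make the paired analysis concrete, here is a minimal sketch of McNemar's test on per-case correctness with and without ChatGPT-4o assistance. It is not the authors' code: the case-level answers are simulated placeholders, and only the headline accuracy rates are taken from the abstract.

```python
# Minimal sketch (simulated data): McNemar's test for paired correct/incorrect
# outcomes of the same reader on the same 80 cases, without vs. with assistance.
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

rng = np.random.default_rng(0)
n_cases = 80

# Hypothetical per-case correctness (1 = correct); real data would pair each
# reader's two answers per case rather than simulate them independently.
unassisted = rng.binomial(1, 0.538, n_cases)  # ~53.8% accuracy (Radiologist-I)
assisted = rng.binomial(1, 0.700, n_cases)    # ~70.0% accuracy with ChatGPT-4o

# 2x2 table of paired outcomes: rows = unassisted, columns = assisted.
table = np.array([
    [np.sum((unassisted == 1) & (assisted == 1)), np.sum((unassisted == 1) & (assisted == 0))],
    [np.sum((unassisted == 0) & (assisted == 1)), np.sum((unassisted == 0) & (assisted == 0))],
])

result = mcnemar(table, exact=True)  # exact binomial test on discordant pairs
print(f"unassisted accuracy: {unassisted.mean():.1%}")
print(f"assisted accuracy:   {assisted.mean():.1%}")
print(f"McNemar p-value:     {result.pvalue:.4f}")
```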

Language: English

Cited

1

Comparison of Performance of Large Language Models on Lung-RADS Related Questions DOI Creative Commons
Eren Çamur, Turay Cesur, Yasin Celal Güneş

et al.

JCO Global Oncology, Journal year: 2024, Number 10

Published: Aug. 1, 2024

This study evaluates the integration of LLMs in interpreting Lung-RADS for lung cancer screening, highlighting their innovative role in enhancing radiological practice. Our findings reveal that Claude 3 Opus and Perplexity achieved a 96% accuracy rate, outperforming the other models.

Language: English

Cited

1

Comparison of the performance of large language models and general radiologist on Ovarian-Adnexal Reporting and Data System (O-RADS)-related questions DOI Open Access
Eren Çamur, Turay Cesur, Yasin Celal Güneş

и другие.

Quantitative Imaging in Medicine and Surgery, Journal year: 2024, Number 14(9), pp. 6990 - 6991

Published: July 24, 2024

Language: English

Cited

0

Comparison of the Knowledge of Large Language Models and General Radiologist on RECIST (Preprint) DOI Creative Commons
Eren Çamur, Turay Cesur, Yasin Celal Güneş

и другие.

Published: July 26, 2024

UNSTRUCTURED This study aims to assess the potential of large language models (LLMs) to enhance reporting efficiency and accuracy in oncological imaging, specifically by evaluating their knowledge of the RECIST 1.1 guidelines. While the capabilities of LLMs have been explored across various domains, their specific applications in radiology are of significant interest due to the intricate and time-consuming nature of image evaluation in oncology. We conducted a comparative analysis involving seven different LLMs and a general radiologist (GR) to determine their proficiency in responding to RECIST 1.1-based multiple-choice questions. Our methodology involved the creation of 25 questions by a board-certified radiologist, ensuring alignment with the RECIST 1.1 guidelines. These questions were presented to the LLMs—Claude 3 Opus, ChatGPT 4, ChatGPT 4o, Gemini 1.5 Pro, Mistral Large, Meta Llama 3 70B, and Perplexity Pro—as well as to a GR with six years of experience. The LLMs were prompted to answer as an experienced radiologist, and their responses were compared with those of the GR. The results demonstrated that Claude 3 Opus achieved a perfect score of 100% (25/25), followed closely by ChatGPT 4o with 96% (24/25). ChatGPT 4 and Mistral Large both scored 92% (23/25), while Gemini 1.5 Pro, Meta Llama 3 70B, and Perplexity Pro each scored 88% (21/25). The GR also scored 92% (23/25). These findings highlight the impressive current capabilities of LLMs in understanding and applying the RECIST 1.1 guidelines, suggesting their potential as valuable tools in radiology. This outstanding performance raises the prospect of LLMs becoming integral to oncology practices, potentially enhancing the reporting process. However, variations in performance among the models underscore the need for further refinement and evaluation. Additionally, this study focused on text-based responses; visual assessment and multimodal capabilities remain unexplored. Given the visual nature of radiology, future research should investigate multimodal integration to fully harness LLMs in clinical settings. In conclusion, our study underscores the high potential of LLMs to assist radiologists in oncological reporting by providing a consistent and reliable approach to interpreting RECIST 1.1, and we advocate their continued development to enhance diagnostic efficiency.
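
As a concrete illustration of the scoring step described above, the short sketch below tallies single-letter answers against an answer key for 25 questions. The key and the responses are hypothetical placeholders; only the totals quoted in the abstract come from the study.

```python
# Minimal sketch: score single-letter multiple-choice answers against a key.
# All question data below are placeholders, not the study's actual questions.
answer_key = list("ACBDA" * 5)  # hypothetical key for 25 questions

responses = {
    "Claude 3 Opus": list("ACBDA" * 5),                        # 25/25 in the study
    "ChatGPT 4o": list("ACBDA" * 4) + list("ACBDB"),           # 24/25
    "General radiologist": list("ACBDA" * 4) + list("ACBEE"),  # 23/25
}

for name, answers in responses.items():
    correct = sum(a == k for a, k in zip(answers, answer_key))
    print(f"{name}: {correct}/25 ({correct / 25:.0%})")
```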

Language: English

Cited

0

Comparison of the Knowledge of Large Language Models and General Radiologist on RECIST (Preprint) DOI Creative Commons
Eren Çamur

Published: Aug. 13, 2024

BACKGROUND: Large language models (LLMs) represent a remarkable breakthrough in natural language processing. What sets the current generation of LLMs apart is their ability to perform very specific tasks in radiology, as in many other fields, without the need for additional training. LLMs have the potential to usher in a new era of efficiency and excellence in radiology practice, both as a supportive diagnostic tool and as a facilitator of the reporting process. This is of great importance for oncological imaging, and researchers have been conducting studies in this field in order to demonstrate the position of LLMs.

OBJECTIVE: We aimed to provide a perspective on how LLMs can improve oncological imaging reporting by comparatively assessing the LLMs' knowledge of RECIST 1.1, both among themselves and against a general radiologist (GR).

METHODS: A radiologist (E.Ç.) prepared 25 multiple-choice questions for this study utilizing information from RECIST 1.1, thus eliminating the need for ethics committee approval. We initiated each session with the following input prompt: "Act like a professor who has 30 years of experience in radiology. Give just the letter of the most correct choice for multiple-choice questions. Each question has only one answer." This prompt was tested in June 2024 on seven different LLMs using default settings. The testing included models from various developers: Claude 3 Opus, ChatGPT 4 and 4o, Gemini 1.5 Pro, Mistral Large, Meta Llama 3 70B, and Perplexity Pro. A GR (T.C.), board certified with the EDiR and with 6 years of experience, answered the same questions.

RESULTS: The results revealed that Claude 3 Opus achieved the highest accuracy, 100% (25/25 questions), followed by OpenAI's newest model, ChatGPT 4o, with 96% (24/25 questions). ChatGPT 4 and Mistral Large scored 92% (23/25 questions), while Gemini 1.5 Pro, Meta Llama 3 70B, and Perplexity Pro had 88% (21/25 questions).

CONCLUSIONS: The outstanding success of Claude 3 Opus in answering all the questions correctly raises the question of whether LLMs can become the new star of radiology. Our study reveals that the majority of LLMs exhibit a commendable level of proficiency, comparable to that of a GR, in answering questions related to RECIST 1.1. These findings show that LLMs have more than sufficient text-based knowledge about RECIST 1.1. Additionally, our findings underscore the high potential of LLMs as tools to assist radiologists in oncological reporting. However, to take full advantage of the abilities of LLMs in reporting, their visual capabilities must also be evaluated, since visual evaluation forms the basis of radiology. Therefore, future studies should focus on evaluating the multimodal abilities of LLMs. In conclusion, such studies in every field will reveal the strengths of LLMs and allow them to be easily integrated into radiological practice.

CLINICALTRIAL: There is no trial registration.

INTERNATIONAL REGISTERED REPORT: RR2-10.2196/preprints.64805
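
To make the quoted prompting protocol concrete, here is a minimal sketch of querying one model with that system prompt. The OpenAI Python client and the model name are illustrative assumptions (the study ran seven different LLMs, each through its own interface at default settings), and the example question is hypothetical.

```python
# Minimal sketch of the prompting protocol quoted in METHODS, shown here with
# the OpenAI Python client as one example backend (an assumption, not the
# study's full setup, which covered seven LLMs at default settings).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "Act like a professor who has 30 years of experience in radiology. "
    "Give just the letter of the most correct choice for multiple-choice "
    "questions. Each question has only one answer."
)

def ask(question: str) -> str:
    """Send one RECIST 1.1 multiple-choice question; return the letter chosen."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content.strip()

# Hypothetical usage:
# print(ask("According to RECIST 1.1, which lesion qualifies as measurable? (A) ... (B) ..."))
```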

Language: English

Cited

0
