Extracting Pulmonary Embolism Diagnoses From Radiology Impressions Using GPT-4o: Large Language Model Evaluation Study (Preprint)
Mohammed A. Mahyoub, Kacie Dougherty, Ajit Shukla, et al.

Published: Oct. 18, 2024

BACKGROUND Pulmonary embolism (PE) is a critical condition requiring rapid diagnosis to reduce mortality. Extracting PE diagnoses from radiology reports manually is time-consuming, highlighting the need for automated solutions. Advances in natural language processing, especially transformer models like GPT-4o, offer promising tools to improve diagnostic accuracy and workflow efficiency in clinical settings. OBJECTIVE This study aimed to develop an automatic extraction system using GPT-4o to extract PE diagnoses from radiology report impressions, enhancing clinical decision-making and workflow efficiency. METHODS In total, 2 approaches were developed and evaluated: a fine-tuned Clinical Longformer as a baseline model and a GPT-4o-based extractor. Clinical Longformer, an encoder-only model, was chosen for its robustness in text classification tasks, particularly on smaller scales. GPT-4o, a decoder-only instruction-following LLM, was selected for its advanced language understanding capabilities. The study aimed to evaluate GPT-4o’s ability to perform text classification compared to the baseline Clinical Longformer. The Clinical Longformer was trained on a dataset of 1000 radiology report impressions and validated on a separate set of 200 samples, while the GPT-4o extractor was validated using the same 200-sample set. Postdeployment performance was further assessed on additional operational records to evaluate model efficacy in a real-world setting. RESULTS GPT-4o outperformed the Clinical Longformer in the reported metrics, achieving a sensitivity of 1.0 (95% CI 1.0-1.0; Wilcoxon test, P<.001) and an F1-score of 0.975 (95% CI 0.9495-0.9947) across the validation dataset. Postdeployment evaluations also showed strong performance of the deployed GPT-4o model, with a sensitivity of 1.0 (95% CI 1.0-1.0), a specificity of 0.94 (95% CI 0.8913-0.9804), and an F1-score of 0.97 (95% CI 0.9479-0.9908). This high level of performance supports a reduction in manual review, streamlining clinical workflows and improving diagnostic precision. CONCLUSIONS The GPT-4o model provides an effective solution for the automatic extraction of PE diagnoses from radiology reports, offering a reliable tool that aids timely and accurate clinical decision-making. This approach has the potential to significantly improve patient outcomes by expediting diagnosis and treatment pathways for critical conditions like PE.

Language: English

Evaluation of the performance of ChatGPT‐4 and ChatGPT‐4o as a learning tool in endodontics
Esra Arılı Öztürk, Ceren Turan Gökduman, Burhan Can Çanakçı, et al.

International Endodontic Journal, Year: 2025, Issue: unknown

Published: March 2, 2025

Abstract Aims The aim of this study was to evaluate the accuracy and consistency of responses given by two different versions of Chat Generative Pre‐trained Transformer (ChatGPT), ChatGPT‐4 and ChatGPT‐4o, to multiple‐choice questions prepared from undergraduate endodontic education topics at different times of the day and on different days. Methodology In total, 60 multiple‐choice, text‐based questions from 6 topics were prepared. Each question was asked to ChatGPT‐4 and ChatGPT‐4o 3 times a day (morning, noon, and evening) on consecutive days. The AIs were compared using SPSS and R programs (p < .05, 95% confidence interval). Results The accuracy rate of ChatGPT‐4o (92.8%) was significantly higher than that of ChatGPT‐4 (81.7%; p < .001). Question group affected the accuracy rates of both AIs, whereas the time at which questions were asked did not affect either AI (p > .05). There was no statistically significant difference in consistency between ChatGPT‐4 and ChatGPT‐4o (p = .123). [...] AI, too. Conclusions According to the results of the study, ChatGPT‐4o performed better than ChatGPT‐4. These findings demonstrate that chatbots can be used in dental education. However, it is also necessary to consider the limitations and potential risks associated with AI.

Language: English

Cited by: 0

Optimizing Clinical Data Availability: Extracting Pulmonary Embolism Diagnoses from Radiology Impressions with GPT-4o
Mohammed A. Mahyoub, Kacie Dougherty, Ajit Shukla, et al.

medRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: Oct. 15, 2024

Abstract Background Pulmonary embolism (PE) is a life-threatening condition that requires timely diagnosis to reduce mortality. Radiology reports, particularly the Impression sections, play a critical role in diagnosing PE. However, manually extracting this information from large volumes of reports is challenging. This study aims to develop an advanced natural language processing (NLP) system using GPT-4o to automatically extract PE diagnoses from radiology report impressions, enhancing clinical workflows and decision-making. Materials and Methods We developed two text classification models: a fine-tuned Clinical Longformer (as a baseline model) and GPT-4o. Models were trained on 1,000 radiology report impressions and validated on 200 samples, with post-deployment evaluation conducted on 500 operational records. The primary dataset was sourced from an electronic medical record relational database, and key metrics such as sensitivity, specificity, and F1 score were used to evaluate model performance. Results GPT-4o achieved superior performance with a 100% score, outperforming Clinical Longformer. Post-deployment, GPT-4o continued to perform flawlessly, identifying all positive cases without false positives or negatives. The model successfully streamlined the clinical workflow, reducing the burden of manual review and improving diagnostic accuracy.

Language: English

Cited by: 0
