Evaluation of Chat Generative Pre-trained Transformer and Microsoft Copilot Performance on the American Society of Surgery of the Hand Self-Assessment Examinations

Taylor R. Rakauskas, António Costa, Claudio Moriconi et al.

Journal of Hand Surgery Global Online, Journal Year: 2024, Volume and Issue: 7(1), P. 23 - 28

Published: Nov. 13, 2024

Artificial intelligence advancements have the potential to transform medical education and patient care. The increasing popularity of large language models has raised important questions regarding their accuracy and agreement with human users. The purpose of this study was to evaluate the performance of Chat Generative Pre-Trained Transformer (ChatGPT), versions 3.5 and 4, as well as Microsoft Copilot, which is powered by ChatGPT-4, on self-assessment examination questions for hand surgery and to compare results between versions. Input included 1,000 questions across 5 years (2015-2019) of self-assessment examinations provided by the American Society for Surgery of the Hand. The primary outcomes were correctness, the percentage concordance relative to other users, and whether an additional prompt was required. Secondary outcomes were assessed according to question type and difficulty. All question formats, including image-based questions, were used for analysis. ChatGPT-3.5 correctly answered 51.6% of questions and ChatGPT-4 63.4%, a statistically significant difference. Copilot correctly answered 59.9%; it outperformed ChatGPT-3.5 but scored significantly lower than ChatGPT-4. However, ChatGPT-3.5 sided with an average of 72.2% of users when correct and 62.1% when incorrect, compared with 67.0% and 53.2%, respectively, for ChatGPT-4, and 79.7% when correct and 52.1% when incorrect for Copilot. The highest scoring subject was Miscellaneous, and the lowest was Neuromuscular, in all versions. In this study, ChatGPT-4 and Microsoft Copilot performed better on the hand surgery subspecialty examinations than did ChatGPT-3.5. Copilot was more accurate than ChatGPT-3.5 but less accurate than ChatGPT-4. [...] able to "pass" the 2015-2019 American Society for Surgery of the Hand self-assessment examinations. While these models hold promise within medical education, caution should be used, and a more detailed evaluation of consistency is needed. Future studies should explore how these models perform across multiple trials and contexts to truly assess their reliability.
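As a rough plausibility check on the accuracy figures quoted in the abstract, the gaps between models can be compared with a two-proportion z-test. This is only a sketch under stated assumptions: the abstract does not say which statistical test the authors used, the sample size of 1,000 questions per model is assumed from the stated input, and the helper names `two_proportion_z` and `compare` are hypothetical.

```python
import math

# Assumption: each model answered all 1,000 questions mentioned in the abstract.
N_QUESTIONS = 1000

# Accuracy figures as quoted in the abstract.
ACCURACY = {
    "ChatGPT-3.5": 0.516,
    "Microsoft Copilot": 0.599,
    "ChatGPT-4": 0.634,
}


def two_proportion_z(p1, p2, n1, n2):
    """Unpaired two-proportion z-test. Returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def compare(model_a, model_b):
    """Compare two models by name using the quoted accuracies."""
    return two_proportion_z(
        ACCURACY[model_a], ACCURACY[model_b], N_QUESTIONS, N_QUESTIONS
    )


if __name__ == "__main__":
    pairs = [
        ("ChatGPT-3.5", "ChatGPT-4"),
        ("ChatGPT-3.5", "Microsoft Copilot"),
        ("Microsoft Copilot", "ChatGPT-4"),
    ]
    for a, b in pairs:
        z, p = compare(a, b)
        print(f"{a} vs {b}: z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the ChatGPT-3.5 versus ChatGPT-4 gap is clearly significant, while the Copilot versus ChatGPT-4 gap does not reach the 0.05 level, which differs from what the abstract reports. Because every model answered the same questions, a paired analysis on per-question results (which the abstract does not give) would be the more appropriate test and could account for the difference.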

Language: English

Comparing performances of French orthopaedic surgery residents with the artificial intelligence ChatGPT-4/4o in the French diploma exams of orthopaedic and trauma surgery

Nabih Maraqa, Ramy Samargandi, A. Poichotte et al.

Orthopaedics & Traumatology Surgery & Research, Journal Year: 2024, Volume and Issue: unknown, P. 104080 - 104080

Published: Dec. 1, 2024

Language: English

Citations

2

Assessing ChatGPT’s summarization of 68Ga PSMA PET/CT reports for patients

Ogün Bülbül, Hande Melike Bülbül, Esat Kaba et al.

Abdominal Radiology, Journal Year: 2024, Volume and Issue: unknown

Published: Sept. 30, 2024

Language: English

Citations

1

Artificial intelligence as an adjunctive tool in hand and wrist surgery: a review

Said Dababneh, Justine Colivas, Nadine Dababneh et al.

Artificial Intelligence Surgery, Journal Year: 2024, Volume and Issue: 4(3), P. 214 - 232

Published: Sept. 2, 2024

Artificial intelligence (AI) is currently utilized across numerous medical disciplines. Nevertheless, despite its promising advancements, AI’s integration in hand surgery remains in its early stages and has not yet been widely implemented, necessitating continued research to validate its efficacy and ensure its safety. Therefore, this review aims to provide an overview of the utilization of AI in hand surgery, emphasizing its current application in clinical practice, along with its potential benefits and associated challenges. A comprehensive literature search was conducted across the PubMed, Embase, Medline, and Cochrane libraries, adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The search focused on identifying articles related to the utilization of AI in hand surgery, using multiple relevant keywords. Each identified article was assessed based on its title, abstract, and full text. The primary search identified 1,228 articles; after applying inclusion/exclusion criteria and a manual bibliography search of the included articles, a total of 98 articles were covered in this review. AI’s primary application in hand and wrist surgery is diagnostic, which includes fracture detection, carpal tunnel syndrome (CTS), avascular necrosis (AVN), and osteoporosis screening. Other applications include residents’ training, patient-doctor communication, surgical assistance, and outcome prediction. Consequently, AI is a very promising tool, though further research is necessary to fully integrate it into clinical practice.

Language: English

Citations

0
