Evaluation of Chat Generative Pre-trained Transformer and Microsoft Copilot Performance on the American Society of Surgery of the Hand Self-Assessment Examinations

Taylor R. Rakauskas, António Costa, Claudio Moriconi et al.

Journal of Hand Surgery Global Online, Year: 2024, No. 7(1), pp. 23-28

Published: Nov. 13, 2024

Artificial intelligence advancements have the potential to transform medical education and patient care. The increasing popularity of large language models has raised important questions regarding their accuracy and agreement with human users. The purpose of this study was to evaluate the performance of Chat Generative Pre-Trained Transformer (ChatGPT), versions 3.5 and 4, as well as Microsoft Copilot, which is powered by ChatGPT-4, on self-assessment examinations for hand surgery and to compare results between versions. Input included 1,000 questions across 5 years (2015-2019) of self-assessment examinations provided by the American Society for Surgery of the Hand. The primary outcomes included correctness, the percentage concordance relative to other users, and whether an additional prompt was required. Secondary outcomes included performance according to question type and difficulty. All question formats, including image-based questions, were used for the analysis. ChatGPT-3.5 correctly answered 51.6% of questions and ChatGPT-4 answered 63.4%, a statistically significant difference. Copilot correctly answered 59.9% of questions; it outperformed ChatGPT-3.5 but scored significantly lower than ChatGPT-4. However, [...] sided with an average of 72.2% of users when correct and 62.1% when incorrect, compared with 67.0% and 53.2%, respectively, and 79.7% when correct and 52.1% when incorrect [...]. The highest scoring subject was Miscellaneous, and the lowest was Neuromuscular in all versions. In this study, ChatGPT-4 and Copilot performed better on the hand surgery subspecialty examinations than did ChatGPT-3.5. Copilot was more accurate than ChatGPT-3.5 but less accurate than ChatGPT-4. [...] able to "pass" the 2015-2019 American Society for Surgery of the Hand examinations. While holding promise within education, caution should be used, and more detailed evaluation of consistency is needed. Future studies should explore how these models perform across multiple trials and contexts to truly assess their reliability.

Language: English

Editorial – Current capacities and future possibilities of large language models in orthopaedic surgery

Assil Mahamid, Lior Laver, Sophie Zahalka et al.

Journal of Experimental Orthopaedics, Year: 2025, No. 12(2)

Published: April 1, 2025

Language: English

Cited by: 0

Comparing performances of French orthopaedic surgery residents with the artificial intelligence ChatGPT-4/4o in the French diploma exams of orthopaedic and trauma surgery

Nabih Maraqa, Ramy Samargandi, A. Poichotte et al.

Orthopaedics & Traumatology Surgery & Research, Year: 2024, No. unknown, p. 104080

Published: Dec. 1, 2024

Language: English

Cited by: 2

Assessing ChatGPT’s summarization of 68Ga PSMA PET/CT reports for patients

Ogün Bülbül, Hande Melike Bülbül, Esat Kaba et al.

Abdominal Radiology, Year: 2024, No. unknown

Published: Sep. 30, 2024

Language: English

Cited by: 1

Artificial intelligence as an adjunctive tool in hand and wrist surgery: a review

Said Dababneh, Justine Colivas, Nadine Dababneh et al.

Artificial Intelligence Surgery, Year: 2024, No. 4(3), pp. 214-32

Published: Sep. 2, 2024

Artificial intelligence (AI) is currently utilized across numerous medical disciplines. Nevertheless, despite its promising advancements, AI’s integration in hand surgery remains in its early stages and has not yet been widely implemented, necessitating continued research to validate its efficacy and ensure its safety. Therefore, this review aims to provide an overview of the utilization of AI in hand surgery, emphasizing its current application in clinical practice, along with its potential benefits and associated challenges. A comprehensive literature search was conducted across PubMed, Embase, Medline, and Cochrane libraries, adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The search focused on identifying articles related to the use of AI in hand surgery, utilizing multiple relevant keywords. Each identified article was assessed based on its title, abstract, and full text. The primary search identified 1,228 articles; after applying inclusion/exclusion criteria and a manual bibliography search of the included articles, a total of 98 articles were covered in this review. The application of AI in hand and wrist surgery was mainly diagnostic, which includes fracture detection, carpal tunnel syndrome (CTS), avascular necrosis (AVN), and osteoporosis screening. Other applications include residents’ training, patient-doctor communication, surgical assistance, and outcome prediction. Consequently, AI is a very promising tool [...], though further research is necessary to fully integrate it into practice.

Language: English

Cited by: 0
