Journal of Hand Surgery Global Online,
Journal Year: 2024,
Volume and Issue: 7(1), P. 23 - 28
Published: Nov. 13, 2024
Artificial intelligence advancements have the potential to transform medical education and patient care. The increasing popularity of large language models has raised important questions regarding their accuracy and agreement with human users. The purpose of this study was to evaluate the performance of Chat Generative Pre-Trained Transformer (ChatGPT), versions 3.5 and 4, as well as Microsoft Copilot, which is powered by ChatGPT-4, on self-assessment examination questions for hand surgery and to compare results between versions. Input included 1,000 questions across 5 years (2015-2019) of self-assessment examinations provided by the American Society for Surgery of the Hand. The primary outcomes included correctness, the percentage concordance relative to other users, and whether an additional prompt was required. Secondary outcomes included accuracy according to question type and difficulty. All question formats, including image-based questions, were used for the analysis. ChatGPT-3.5 correctly answered 51.6% of questions and ChatGPT-4 correctly answered 63.4%, a statistically significant difference. Copilot correctly answered 59.9% of questions; it outperformed ChatGPT-3.5 but scored significantly lower than ChatGPT-4. However, ChatGPT-3.5 sided with an average of 72.2% of users when correct and 62.1% when incorrect, compared with an average of 67.0% and 53.2%, respectively, for ChatGPT-4, and 79.7% when correct and 52.1% when incorrect for Copilot. The highest scoring subject was Miscellaneous, and the lowest scoring subject was Neuromuscular in all versions. In this study, ChatGPT-4 and Copilot perform better on the hand surgery subspecialty examinations than did ChatGPT-3.5. Copilot was more accurate than ChatGPT3.5 but less accurate than ChatGPT4. Large language models were able to "pass" the 2015-2019 American Society for Surgery of the Hand self-assessment examinations. While holding promise within medical education, caution should be used, as more detailed evaluation of consistency is needed. Future studies should explore how these models perform across multiple trials and contexts to truly assess their reliability.
Artificial Intelligence Surgery,
Journal Year: 2024,
Volume and Issue: 4(3), P. 214 - 32
Published: Sept. 2, 2024
Artificial intelligence (AI) is currently utilized across numerous medical disciplines. Nevertheless, despite its promising advancements, AI’s integration in hand surgery remains in its early stages and has not yet been widely implemented, necessitating continued research to validate its efficacy and ensure its safety. Therefore, this review aims to provide an overview of the utilization of AI in hand surgery, emphasizing its current application in clinical practice, along with its potential benefits and associated challenges. A comprehensive literature search was conducted across PubMed, Embase, Medline, and Cochrane libraries, adhering to the Preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines. The search focused on identifying articles related to the application of AI in hand surgery, utilizing multiple relevant keywords. Each identified article was assessed based on its title, abstract, and full text. The primary search identified 1,228 articles; after inclusion/exclusion criteria and a manual search of the bibliography of included articles, a total of 98 articles were covered in this review. AI’s primary application in hand and wrist surgery is diagnostic, which includes hand and wrist fracture detection, carpal tunnel syndrome (CTS), avascular necrosis (AVN), and osteoporosis screening. Other applications include residents’ training, patient-doctor communication, surgical assistance, and outcome prediction. Consequently, AI is a very promising tool that has numerous potential applications in hand surgery, though further research is necessary to fully integrate it into clinical practice.