Aesthetic Surgery Journal, Journal Year: 2024, Volume and Issue: 44(7), P. 769 - 778, Published: Feb. 14, 2024
Social media platforms have come to represent integral components of the professional marketing and advertising strategy for plastic surgeons. Effective and consistent content development, however, remains technically demanding and time consuming, prompting most surgeons to employ, at non-negligible costs, social media specialists for content planning and development.
Healthcare, Journal Year: 2024, Volume and Issue: 12(8), P. 825 - 825, Published: April 13, 2024
Introduction: As large language models receive greater attention in medical research, the investigation of ethical considerations is warranted. This review aims to explore the surgery literature, identify concerns surrounding these artificial intelligence tools, evaluate how autonomy, beneficence, nonmaleficence, and justice are represented within these discussions, and provide insights in order to guide further research and practice.

Methods: A systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Five electronic databases were searched in October 2023. Eligible studies included surgery-related articles that focused on large language models and contained adequate ethical discussion. Study details, including specialty and ethical concerns, were collected.

Results: The search yielded 1179 articles, with 53 meeting inclusion criteria. Plastic surgery, orthopedic surgery, and neurosurgery were the surgical specialties most represented. Autonomy was the most explicitly cited principle. The most frequently discussed concern was accuracy (n = 45, 84.9%), followed by bias, patient confidentiality, and responsibility.

Conclusion: The ethical implications of using large language models in surgery are complex and evolving. Their integration into surgery necessitates continuous discourse to ensure responsible use, balancing technological advancement with human dignity and safety.
Prostate Cancer and Prostatic Diseases, Journal Year: 2024, Volume and Issue: unknown, Published: May 14, 2024
Abstract
Background: Generative Pretrained Model (GPT) chatbots have gained popularity since the public release of ChatGPT. Studies have evaluated the ability of different GPT models to provide information about medical conditions. To date, no study has assessed the quality of ChatGPT outputs to prostate cancer related questions from both the physician and patient perspective while optimizing the content for patient consumption.

Methods: Nine prostate cancer-related questions, identified through Google Trends (Global), were categorized into diagnosis, treatment, and postoperative follow-up. These were processed using ChatGPT 3.5, and the responses were recorded. Subsequently, these responses were re-inputted to create simplified summaries understandable at a sixth-grade level. Readability of the original and layperson outputs was validated using readability tools. A survey was conducted among urology providers (urologists and urologists in training) to rate the outputs for accuracy, completeness, and clarity on a 5-point Likert scale. Furthermore, two independent reviewers rated the outputs on a correctness trifecta, including decision-making sufficiency. Public assessment of the summaries' understandability was carried out on Amazon Mechanical Turk (MTurk). Participants rated the summaries and demonstrated their understanding with a multiple-choice question.

Results: The GPT-generated output was deemed correct by 71.7% to 94.3% of raters (36 urologists, 17 residents) across the 9 scenarios. The independent reviewers rated the output as accurate in 8 (88.9%) scenarios and as sufficient to make a decision. Mean readability of the layperson summaries was higher than that of the original outputs ([original vs. layperson summary, mean (SD), p-value] Flesch Reading Ease: 36.5 (9.1) vs. 70.2 (11.2), <0.0001; Gunning Fog: 15.8 (1.7) vs. 9.5 (2.0), <0.0001; Grade Level: 12.8 (1.2) vs. 7.4 (1.7); Coleman Liau: 13.7 (2.1) vs. 8.6 (2.4), 0.0002; Smog index: 11.8 (1.2) vs. 6.7 (1.8); Automated Readability Index: 13.1 (1.4) vs. 7.5 (2.1), 0.0001). MTurk workers (n = 514) rated the summaries favorably (89.5–95.7%) and correctly understood the content (63.0–87.4%).

Conclusion: ChatGPT shows promise for patient education content, but the technology is not designed for delivering medical information to patients. Prompting the model to respond at a simpler reading level may enhance its utility when used in GPT-powered chatbots.
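For reference, the readability indices compared in the study above are computed from simple surface statistics of the text. The standard published formulas are sketched below; these are the generic definitions, not implementation details taken from the study.

```latex
% W = words, S = sentences, Sy = syllables, C = words with 3+ syllables,
% L = letters per 100 words, S_w = sentences per 100 words, Ch = characters.
\[
\text{Flesch Reading Ease} = 206.835 - 1.015\,\tfrac{W}{S} - 84.6\,\tfrac{Sy}{W}
\qquad
\text{Flesch--Kincaid Grade Level} = 0.39\,\tfrac{W}{S} + 11.8\,\tfrac{Sy}{W} - 15.59
\]
\[
\text{Gunning Fog} = 0.4\left(\tfrac{W}{S} + 100\,\tfrac{C}{W}\right)
\qquad
\text{SMOG} = 1.043\sqrt{30\,\tfrac{C}{S}} + 3.1291
\]
\[
\text{Coleman--Liau} = 0.0588\,L - 0.296\,S_w - 15.8
\qquad
\text{ARI} = 4.71\,\tfrac{Ch}{W} + 0.5\,\tfrac{W}{S} - 21.43
\]
```

A higher Flesch Reading Ease score means easier text, while the other five indices approximate a US school grade level; this is why the simplified summaries score higher on FRE and lower on every other index.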
Journal of Clinical Medicine, Journal Year: 2024, Volume and Issue: 13(11), P. 3041 - 3041, Published: May 22, 2024
Background: Large language models (LLMs) represent a recent advancement in artificial intelligence with medical applications across various healthcare domains. The objective of this review is to highlight how LLMs can be utilized by clinicians and surgeons in their everyday practice.

Methods: A systematic review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Six databases were searched to identify relevant articles. Eligibility criteria emphasized articles focused primarily on clinical and surgical applications of LLMs.

Results: The literature search yielded 333 results, with 34 meeting eligibility criteria. All were from 2023. There were 14 original research articles, four letters, one interview, and 15 others. These covered a wide variety of specialties, including surgical subspecialties.

Conclusions: LLMs have the potential to enhance healthcare delivery. In clinical settings, they can assist with diagnosis, treatment guidance, patient triage, physician knowledge augmentation, and administrative tasks. In surgical settings, they can assist with documentation, surgical planning, and intraoperative guidance. However, addressing their limitations and concerns, particularly those related to accuracy and biases, remains crucial. LLMs should be viewed as tools that complement, not replace, the expertise of healthcare professionals.
Journal of Reconstructive Microsurgery, Journal Year: 2024, Volume and Issue: 40(09), P. 657 - 664, Published: Feb. 21, 2024
With the growing relevance of artificial intelligence (AI)-based patient-facing information, microsurgical-specific online information provided by professional organizations was compared with that of ChatGPT (Chat Generative Pre-Trained Transformer) and assessed for accuracy, comprehensiveness, clarity, and readability.
Informatics, Journal Year: 2025, Volume and Issue: 12(1), P. 9 - 9, Published: Jan. 17, 2025
The rapid advancement of large language models like ChatGPT has significantly impacted natural language processing, expanding its applications across various fields, including healthcare. However, there remains a significant gap in understanding the consistency and reliability of ChatGPT's performance across different medical domains. We conducted this systematic review according to an LLM-assisted PRISMA setup. The high-recall search term "ChatGPT" yielded 1101 articles from 2023 onwards. Through a dual-phase screening process, initially automated via an LLM and subsequently performed manually by human reviewers, 128 studies were included. These studies covered a range of medical specialties, focusing on diagnosis, disease management, and patient education. The assessment metrics varied, but most studies compared ChatGPT's accuracy against evaluations by clinicians or reliable references. In several areas, ChatGPT demonstrated high accuracy, underscoring its effectiveness; in some contexts, however, it showed lower accuracy. The mixed outcomes across domains emphasize both the challenges and the opportunities of integrating AI into healthcare. The high accuracy in certain areas suggests substantial utility, yet the inconsistency across all domains indicates a need for ongoing evaluation and refinement. This review highlights ChatGPT's potential to improve healthcare delivery alongside the necessity for continued research to ensure reliability.
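The abstract above describes a dual-phase screening workflow (an automated LLM pre-screen followed by manual human review) but gives no implementation details. Purely as an illustration of that structure, a minimal sketch might look like the following; `llm_screen` is a hypothetical stand-in for whichever model and prompt the reviewers actually used.

```python
# Illustrative sketch only: dual-phase screening (LLM pre-screen + human review).
# `llm_screen` is a stand-in; here it is a trivial keyword check, not the
# model call used in the published review.

def llm_screen(abstract: str) -> bool:
    # Stand-in for an LLM relevance judgment.
    return "chatgpt" in abstract.lower()

def dual_phase_screen(records, human_review):
    """Phase 1: automated pre-screen; phase 2: manual confirmation by reviewers."""
    shortlisted = [r for r in records if llm_screen(r["abstract"])]
    return [r for r in shortlisted if human_review(r)]

# Toy usage with invented records:
records = [{"id": 1, "abstract": "ChatGPT for patient education in urology ..."},
           {"id": 2, "abstract": "Unrelated laboratory assay validation ..."}]
included = dual_phase_screen(records, human_review=lambda r: True)
print([r["id"] for r in included])  # -> [1]
```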
Diagnostics, Journal Year: 2025, Volume and Issue: 15(5), P. 587 - 587, Published: Feb. 28, 2025
Background: Dupuytren's disease is a fibroproliferative disease affecting the hand's palmar fascia that leads to progressive finger contractures and functional limitations. Management of this condition relies heavily on the expertise of hand surgeons, who tailor interventions based on clinical assessment. With growing interest in artificial intelligence (AI) for medical decision-making, this study aims to evaluate the feasibility of integrating AI into management by comparing AI-generated recommendations with those of expert hand surgeons.

Methods: This multicentric comparative study involved three experienced hand surgeons and five AI systems (ChatGPT, Gemini, Perplexity, DeepSeek, and Copilot). Twenty-two standardized prompts representing various clinical scenarios were used to assess decision-making. Surgeons and AI systems provided recommendations, which were analyzed for concordance, rationale, and predicted outcomes. Key metrics included union accuracy, surgeon agreement, precision, recall, and F1 scores. The study also evaluated performance in unanimous versus non-unanimous cases and inter-AI agreements.

Results: Gemini and ChatGPT demonstrated the highest accuracy (86.4% and 81.8%, respectively), while Copilot showed the lowest (40.9%). Surgeon agreement was 45.5% and 42.4%, respectively. The AI systems performed better in unanimous cases (accuracy up to 92.0%) than in non-unanimous cases (as low as 35.0%). Inter-AI agreements ranged from 75.0% (ChatGPT-Gemini) to 48.0% (DeepSeek-Copilot). Precision, recall, and F1 scores were consistently higher for these systems than for the other systems.

Conclusions: AI systems, particularly ChatGPT, show promise in aligning with expert surgical decision-making, especially in straightforward cases. However, significant variability exists, particularly in complex scenarios. AI should be viewed as complementary to expert judgment, requiring further refinement and validation before integration into clinical practice.
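The precision, recall, and F1 metrics reported above follow their standard definitions. Taking the expert surgeon recommendation as the reference label, and counting true/false positives and negatives per scenario, they are computed as follows (generic formulas, not study-specific):

```latex
\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\]
```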
Journal of Clinical Medicine, Journal Year: 2024, Volume and Issue: 13(10), P. 2832 - 2832, Published: May 11, 2024
Background: OpenAI's ChatGPT (San Francisco, CA, USA) and Google's Gemini (Mountain View, CA, USA) are two large language models that show promise in improving and expediting medical decision making in hand surgery. Evaluating the applications of these models within the field of hand surgery is warranted. This study aims to evaluate ChatGPT-4 and Gemini in classifying hand injuries and recommending treatment.

Methods: The models were each given 68 fictionalized clinical vignettes twice. The models were asked to use a specific classification system and to recommend surgical or nonsurgical treatment. Classifications were scored based on correctness. Results were analyzed using descriptive statistics, a paired two-tailed t-test, and sensitivity testing.

Results: Gemini, correctly classifying 70.6% of injuries, demonstrated superior classification ability (mean score 1.46 vs. 0.87, p-value < 0.001). For management, one model showed higher sensitivity in recommending surgical intervention (98.0% vs. 88.8%) but lower specificity (68.4% vs. 94.7%). When compared with ChatGPT, Gemini demonstrated greater response replicability.

Conclusions: Large language models like these show promise in assisting medical decision making, particularly in hand surgery, with Gemini generally outperforming ChatGPT. These findings emphasize the importance of considering the strengths and limitations of different models when integrating them into clinical practice.
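For context, the sensitivity and specificity figures quoted above follow the usual definitions, with "recommend surgical intervention" treated as the positive class (generic formulas, not study-specific):

```latex
\[
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP}
\]
```

A model can achieve high sensitivity simply by recommending surgery liberally, at the cost of specificity, which matches the trade-off reported in these results.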
Medicina, Journal Year: 2024, Volume and Issue: 60(6), P. 957 - 957, Published: June 8, 2024
Background and Objectives: Large language models (LLMs) are emerging as valuable tools in plastic surgery, potentially reducing surgeons' cognitive loads and improving patients' outcomes. This study aimed to assess and compare the current state of the two most common and readily available LLMs, Open AI's ChatGPT-4 and Google's Gemini Pro (1.0 Pro), in providing intraoperative decision support in reconstructive surgery procedures.

Materials and Methods: We presented each LLM with 32 independent intraoperative scenarios spanning 5 procedures. We utilized a 5-point and a 3-point Likert scale for medical accuracy and relevance, respectively. We determined the readability of the responses using the Flesch–Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) score. Additionally, we measured the models' response time. We compared the models' performance using the Mann–Whitney U test and Student's t-test.

Results: ChatGPT-4 significantly outperformed Gemini with more accurate (3.59 ± 0.84 vs. 3.13 ± 0.83, p-value = 0.022) and more relevant (2.28 ± 0.77 vs. 1.88, p = 0.032) responses. Alternatively, Gemini provided more concise and readable responses, with an average FKGL (12.80 ± 1.56) lower than ChatGPT-4's (15.00 ± 1.89) (p < 0.0001). However, there was no difference in FRE scores (p = 0.174). Moreover, Gemini's response time was faster (8.15 ± 1.42 s) than ChatGPT-4's (13.70 ± 2.87 s).

Conclusions: Although both LLMs demonstrated potential as decision-support tools, their inconsistency across different procedures underscores the need for further training and optimization to ensure their reliability as decision-support tools.
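As a minimal illustration of the statistical comparison named in the Methods above (not the authors' actual analysis code), ordinal Likert ratings can be compared with a Mann–Whitney U test and continuous measures such as response time with a t-test, for example in Python with SciPy. The abstract does not state which test was applied to which measure, and the data below are invented for demonstration.

```python
# Illustrative sketch only: compares hypothetical per-scenario Likert ratings
# and response times for two LLMs, mirroring the tests named in the Methods.
from scipy import stats

chatgpt_accuracy = [4, 3, 4, 5, 3, 4, 2, 4]   # hypothetical 5-point Likert ratings
gemini_accuracy  = [3, 3, 4, 3, 2, 4, 3, 3]

chatgpt_time = [13.2, 14.1, 12.8, 15.0]       # hypothetical response times (s)
gemini_time  = [8.0, 7.9, 8.4, 8.3]

# Ordinal ratings -> nonparametric Mann-Whitney U test
u_stat, p_likert = stats.mannwhitneyu(chatgpt_accuracy, gemini_accuracy,
                                      alternative="two-sided")

# Continuous response times -> Student's t-test
t_stat, p_time = stats.ttest_ind(chatgpt_time, gemini_time)

print(f"Likert ratings: U={u_stat:.1f}, p={p_likert:.3f}")
print(f"Response time: t={t_stat:.2f}, p={p_time:.3f}")
```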
Plastic & Reconstructive Surgery Global Open, Journal Year: 2024, Volume and Issue: 12(2), P. e5580 - e5580, Published: Feb. 1, 2024
Background: Given the dialogistic properties of ChatGPT, we hypothesized that this artificial intelligence (AI) function can be used as a self-service tool where clinical questions are directly answered by AI. Our objective was to assess the content, accuracy, and accessibility of AI-generated content regarding common perioperative questions for reduction mammaplasty.

Methods: ChatGPT (OpenAI, February Version, San Francisco, Calif.) was used to query 20 common patient concerns that arise in the perioperative period of reduction mammaplasty. Searches were performed in duplicate using both a general term and a specific question. Query outputs were analyzed objectively and subjectively. Descriptive statistics, t tests, and chi-square tests were applied as appropriate, with a predetermined level of significance of P less than 0.05.

Results: From a total of 40 outputs, the mean word length was 191.8 words. Readability was at the thirteenth grade level. Regarding content, 97.5% of all outputs were on topic. Medical advice was deemed reasonable in 100% of cases. General queries more frequently reported overarching background information, whereas specific questions more frequently reported prescriptive information (P < 0.0001). The AI specifically recommended following surgeon-provided postoperative instructions in 82.5% of instances.

Conclusions: Currently available AI tools, in their nascent form, provide reasonable recommendations for common perioperative questions. With further calibration, AI interfaces may serve a role in fielding patient questions in the future; however, patients must always retain the ability to bypass the technology and be able to contact their surgeon.
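The general-versus-specific query comparison reported above is a categorical comparison of output types, for which a chi-square test on a contingency table of counts is the standard approach. A brief sketch follows; the counts are invented for illustration, since the abstract does not report the raw contingency table.

```python
# Illustrative only: chi-square test comparing how often general vs. specific
# queries produced background vs. prescriptive information. Counts are invented.
from scipy.stats import chi2_contingency

#          background  prescriptive
table = [[15, 5],    # general-term queries (hypothetical counts)
         [3, 17]]    # specific-question queries (hypothetical counts)

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, dof={dof}, p={p:.4f}")
```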