Reliability, Accuracy, and Comprehensibility of AI-Based Responses to Common Patient Questions Regarding Spinal Cord Stimulation
Journal of Clinical Medicine, Year: 2025, Issue: 14(5), pp. 1453–1453
Published: Feb. 21, 2025
Background: Although spinal cord stimulation (SCS) is an effective treatment for managing chronic pain, many patients have understandable questions and concerns regarding this therapy. Artificial intelligence (AI) has shown promise in delivering patient education in healthcare. This study evaluates the reliability, accuracy, and comprehensibility of ChatGPT’s responses to common patient inquiries about SCS.
Methods: Thirteen commonly asked questions about SCS were selected based on the authors’ clinical experience in pain management and a targeted review of educational materials and relevant medical literature. The questions were prioritized according to their frequency in patient consultations, their relevance to clinical decision-making about SCS, and the complexity of the information typically required to address them comprehensively. These questions spanned three domains: pre-procedural, intra-procedural, and post-procedural concerns. Responses were generated using GPT-4.0 with the prompt “If you were a physician, how would you answer a patient asking…”. The responses were independently assessed by 10 physicians and two non-healthcare professionals using a Likert scale for reliability (1–6 points), accuracy (1–3 points), and comprehensibility (1–3 points).
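For readers who want to see what the response-generation step could look like in practice, the sketch below submits the stated prompt template programmatically. The study does not say whether the ChatGPT web interface or the API was used, so the client setup, the model name ("gpt-4"), and the ask_gpt4 helper are illustrative assumptions only.

```python
# Illustrative sketch only: the study does not state whether the ChatGPT web
# interface or the API was used. The client setup, model name ("gpt-4"), and
# the ask_gpt4 helper below are assumptions for demonstration.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PROMPT_TEMPLATE = (
    "If you were a physician, how would you answer a patient asking: {question}"
)

def ask_gpt4(question: str) -> str:
    """Return a single model response for one patient question."""
    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": PROMPT_TEMPLATE.format(question=question)}],
    )
    return completion.choices[0].message.content

# Example with one of the general pre-procedural questions from the study
print(ask_gpt4("What is spinal cord stimulation?"))
```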
Results: The responses demonstrated strong reliability (5.1 ± 0.7) and comprehensibility (2.8 ± 0.2), with 92% and 98% of responses, respectively, meeting or exceeding our predefined thresholds. Accuracy was 2.7 ± 0.3, with 95% of responses rated as sufficiently accurate. General queries, such as “What is spinal cord stimulation?” and “What are the risks and benefits?”, received higher scores compared with more technical questions like “What are the different types of waveforms used in SCS?”.
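As a minimal sketch of how such summary statistics can be derived from raw rater scores (mean ± SD across all ratings, plus the share of responses whose mean rating reaches a cut-off), see below. The ratings and the threshold value are hypothetical placeholders, not data from the study.

```python
# Hypothetical example: aggregating per-response Likert ratings into
# mean ± SD and the percentage of responses meeting a threshold.
from statistics import mean, stdev

# rows = evaluated responses, columns = the 12 raters' reliability scores (1-6)
reliability = [
    [5, 6, 5, 5, 4, 6, 5, 5, 6, 5, 4, 5],
    [4, 5, 5, 4, 5, 5, 6, 4, 5, 5, 5, 4],
    # ... one row per response
]

all_scores = [score for row in reliability for score in row]
print(f"reliability: {mean(all_scores):.1f} ± {stdev(all_scores):.1f}")

THRESHOLD = 4  # placeholder cut-off on the 1-6 scale, not the study's value
per_response = [mean(row) for row in reliability]
share = sum(m >= THRESHOLD for m in per_response) / len(per_response)
print(f"{share:.0%} of responses meet the threshold")
```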
Conclusions: ChatGPT can be implemented as a supplementary tool for patient education, particularly for addressing general procedural queries about SCS. However, the AI’s performance was less robust for highly nuanced or technical questions.
Language: English
Language Artificial Intelligence Models as Pioneers in Diagnostic Medicine? A Retrospective Analysis on Real-Time Patients
Journal of Clinical Medicine, Year: 2025, Issue: 14(4), pp. 1131–1131
Published: Feb. 10, 2025
Background/Objectives: GPT-3.5 and GPT-4 have shown promise in assisting healthcare professionals with clinical questions. However, their performance in real-time clinical scenarios remains underexplored. This study aims to evaluate their precision and reliability compared with board-certified emergency department attendings, highlighting their potential for improving patient care. We hypothesized that attendings at Maimonides Medical Center exhibit higher accuracy than the AI models in generating differentials based on the history and physical examination of patients presenting to the emergency department.
Methods: Real-time patient data from Maimonides Medical Center’s emergency department, collected between 1 January 2023 and March 2023, were analyzed. Demographic details, symptoms, medical history, and discharge diagnoses recorded by emergency room physicians were examined. The AI algorithms (ChatGPT-3.5 and GPT-4) generated differential diagnoses, which were compared with those of the attending physicians. Accuracy was determined by comparing each rater’s differential with the gold standard discharge diagnosis and calculating the proportion of correctly identified cases. Precision was assessed using Cohen’s kappa coefficient and the Intraclass Correlation Coefficient to measure agreement between raters.
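The sketch below illustrates the two calculations named in the Methods: accuracy as the proportion of cases in which a rater's differential contains the gold-standard discharge diagnosis, and Cohen's kappa (via scikit-learn) applied to per-case hit/miss indicators, which is one plausible reading of "agreement between raters". The case data, diagnosis labels, and the hits helper are hypothetical, and the ICC step is not reproduced.

```python
# Hypothetical example of the accuracy and agreement measures described above.
from sklearn.metrics import cohen_kappa_score

gold = ["sepsis", "appendicitis", "nstemi"]                     # discharge diagnoses
gpt4 = [["sepsis", "uti"], ["appendicitis"], ["pe", "nstemi"]]  # differentials per case
attending = [["pneumonia"], ["appendicitis"], ["nstemi"]]

def hits(differentials, gold_labels):
    """1 if the gold diagnosis appears in the rater's differential, else 0."""
    return [int(g in d) for d, g in zip(differentials, gold_labels)]

gpt4_hits, attending_hits = hits(gpt4, gold), hits(attending, gold)
print("ChatGPT-4 accuracy:", sum(gpt4_hits) / len(gold))          # proportion correct
print("Attending accuracy:", sum(attending_hits) / len(gold))
print("Cohen's kappa:", cohen_kappa_score(gpt4_hits, attending_hits))
```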
Results: The mean patient age was 49.12 years, with 57.3% males and 42.7% females. Chief complaints included fever/sepsis (24.7%), gastrointestinal issues (17.7%), and cardiovascular problems (16.4%). Diagnostic accuracy against the gold standard was highest for ChatGPT-4 (85.5%), followed by ChatGPT-3.5 (84.6%) and the ED attendings (83%). Agreement analysis demonstrated moderate agreement (0.7) between the AI models, with lower agreement observed for the attendings. Stratified analysis revealed a difference in ChatGPT accuracy between subgroups (87.5% vs. 81.34%).
Conclusions: Our study demonstrates that these models achieve diagnostic accuracy comparable to that of emergency department attendings and can aid clinical decision-making in dynamic settings. The stratified analysis of chatbot performance, particularly for high-risk presentations, provided targeted insights into rater agreement within specific domains. This work contributes to integrating AI models into clinical practice, enhancing the efficiency and effectiveness of clinical decision-making. Further research is warranted to explore broader applications of AI in healthcare.
Language: English
An Assessment of ChatGPT’s Responses to Common Patient Questions About Lung Cancer Surgery: A Preliminary Clinical Evaluation of Accuracy and Relevance
Journal of Clinical Medicine, Year: 2025, Issue: 14(5), pp. 1676–1676
Published: March 1, 2025
Background: Chatbots based on artificial intelligence (AI) and machine learning are rapidly growing in popularity. Patients may use these technologies to ask questions regarding surgical interventions, preoperative assessments, and postoperative outcomes. The aim of this study was to determine whether ChatGPT could appropriately answer some of the most frequently asked questions posed by patients about lung cancer surgery.
Methods: Sixteen questions about lung cancer surgery were submitted to the chatbot in a single conversation, without follow-up questions or repetition of the same questions. Each response was evaluated for appropriateness and accuracy using an evidence-based approach by a panel of specialists with relevant clinical experience. The responses were assessed on a four-point Likert scale (i.e., “strongly agree, satisfactory”, “agree, requires minimal clarification”, “disagree, requires moderate clarification”, and “strongly disagree, requires substantial clarification”).
Results: All answers provided by the chatbot were judged to be satisfactory, evidence-based, and generally unbiased overall, seldom requiring further clarification. Moreover, the information was delivered in language deemed easy to read and comprehensible for patients.
Conclusions: ChatGPT can effectively provide answers to questions commonly presented by patients about lung cancer surgery, in language considered understandable by patients. Therefore, this resource may serve as a valuable adjunctive tool for patient education.
Language: English
Effectiveness of Generative Artificial Intelligence-Driven Responses to Patient Concerns in Long-Term Opioid Therapy: Cross-Model Assessment
Biomedicines, Year: 2025, Issue: 13(3), pp. 636–636
Published: March 5, 2025
Background: While long-term opioid therapy is a widely utilized strategy for managing chronic pain, many patients have understandable questions and concerns regarding its safety, efficacy, and the potential for dependency and addiction. Providing clear, accurate, and reliable information is essential for fostering patient understanding and acceptance. Generative artificial intelligence (AI) applications offer interesting avenues for delivering patient education in healthcare. This study evaluates the reliability, accuracy, and comprehensibility of ChatGPT’s responses to common patient inquiries about long-term opioid therapy.
Methods: An expert panel selected thirteen frequently asked questions based on the authors’ clinical experience in pain management and a targeted review of patient education materials. Questions were prioritized according to their prevalence in patient consultations, their relevance to treatment decision-making, and the complexity of the information typically required to address them comprehensively. We additionally assessed responses generated by implementing the multimodal generative AI Copilot (Microsoft 365 Chat). Spanning three domains (pre-therapy, during therapy, and post-therapy), each question was submitted to GPT-4.0 with the prompt “If you were a physician, how would you answer a patient asking…”. Ten physicians and two non-healthcare professionals independently rated the responses using a Likert scale for reliability (1–6 points), accuracy (1–3 points), and comprehensibility (1–3 points).
Results: Overall, the responses demonstrated high reliability (5.2 ± 0.6) and good comprehensibility (2.8 ± 0.2), with most answers meeting or exceeding the predefined thresholds. Accuracy was moderate (2.7 ± 0.3), with lower performance on more technical topics such as tolerance management.
Conclusions: Generative AI applications exhibit significant potential as a supplementary tool for patient education, although their limitations in addressing highly nuanced and context-specific queries underscore the need for ongoing refinement and domain-specific training. Integrating AI systems into clinical practice should involve collaboration between healthcare professionals and AI developers to ensure safe, personalized, and up-to-date patient education.
Language: English