Systematic Review of Large Language Models for Patient Care: Current Applications and Challenges
medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: March 5, 2024
Abstract
The introduction of large language models (LLMs) into clinical practice promises to improve patient education and empowerment, thereby personalizing medical care and broadening access to medical knowledge. Despite the popularity of LLMs, there is a significant gap in systematized information on their use in patient care. Therefore, this systematic review aims to synthesize current applications and limitations of LLMs in patient care using a data-driven convergent synthesis approach. We searched 5 databases for qualitative, quantitative, and mixed methods articles on LLMs in patient care published between 2022 and 2023. From 4,349 initial records, 89 studies across 29 medical specialties were included, primarily examining models based on the GPT-3.5 (53.2%, n=66 of 124 different LLMs examined per study) and GPT-4 (26.6%, n=33/124) architectures in medical question answering, followed by patient information generation, including medical text summarization or translation, and clinical documentation. Our analysis delineates two primary domains of LLM limitations: design and output. Design limitations included 6 second-order and 12 third-order codes, such as lack of medical domain optimization, data transparency, and accessibility issues, while output limitations included 9 second-order and 32 third-order codes, for example, non-reproducibility, non-comprehensiveness, incorrectness, unsafety, and bias. In conclusion, this study is the first review to systematically map LLM applications and limitations in patient care, providing a foundational framework and taxonomy for their implementation and evaluation in healthcare settings.
Language: English
Current applications and challenges in large language models for patient care: a systematic review
Communications Medicine, Journal Year: 2025, Volume and Issue: 5(1)
Published: Jan. 21, 2025
Abstract
Background
The introduction of large language models (LLMs) into clinical practice promises to improve patient education and empowerment, thereby personalizing medical care and broadening access to medical knowledge. Despite the popularity of LLMs, there is a significant gap in systematized information on their use in patient care. Therefore, this systematic review aims to synthesize current applications and limitations of LLMs in patient care.
Methods
We systematically searched 5 databases for qualitative, quantitative, and mixed methods articles on LLMs in patient care published between 2022 and 2023. From 4349 initial records, 89 studies across 29 medical specialties were included. Quality assessment was performed using the Mixed Methods Appraisal Tool 2018. A data-driven convergent synthesis approach was applied for thematic syntheses of LLM applications and limitations using free line-by-line coding in Dedoose.
Results
We show that most studies investigate Generative Pre-trained Transformers (GPT)-3.5 (53.2%, n = 66 of 124 different LLMs examined) and GPT-4 (26.6%, n = 33/124) in answering medical questions, followed by patient information generation, including medical text summarization or translation, and clinical documentation. Our analysis delineates two primary domains of LLM limitations: design and output. Design limitations include 6 second-order and 12 third-order codes, such as lack of medical domain optimization, data transparency, and accessibility issues, while output limitations include 9 second-order and 32 third-order codes, for example, non-reproducibility, non-comprehensiveness, incorrectness, unsafety, and bias.
Conclusions
This review systematically maps LLM applications and limitations in patient care, providing a foundational framework and taxonomy for their implementation and evaluation in healthcare settings.
Language: English
Utilizing large language models for gastroenterology research: a conceptual framework
Parul Berry, Rohan Raju Dhanakshirur, Sahil Khanna, et al.
Therapeutic Advances in Gastroenterology, Journal Year: 2025, Volume and Issue: 18
Published: Jan. 1, 2025
Large language models (LLMs) transform healthcare by assisting clinicians with decision-making, research, and patient management. In gastroenterology, LLMs have shown potential in clinical decision support, data extraction, and patient education. However, challenges such as bias, hallucinations, integration with clinical workflows, and regulatory compliance must be addressed for safe and effective implementation. This manuscript presents a structured framework for integrating LLMs into gastroenterology, using Hepatitis C treatment as a real-world application. The framework outlines key steps to ensure accuracy, safety, and clinical relevance while mitigating risks associated with artificial intelligence (AI)-driven tools. The framework includes defining clinical goals, assembling a multidisciplinary team, data collection and preparation, model selection, fine-tuning, calibration, hallucination mitigation, user interface development, integration with electronic health records, validation, and continuous improvement. Retrieval-augmented generation and fine-tuning approaches are evaluated for optimizing model adaptability. Bias detection, reinforcement learning from human feedback, and prompt engineering are incorporated to enhance reliability. Ethical considerations, including the Health Insurance Portability and Accountability Act, General Data Protection Regulation, and AI-specific guidelines (DECIDE-AI, SPIRIT-AI, CONSORT-AI), are addressed to ensure responsible AI deployment. LLMs have the potential to enhance decision-making, research efficiency, and patient care, but deployment requires bias mitigation, transparency, and ongoing validation. Future research should focus on multi-institutional validation and AI-assisted clinical trials to establish LLMs as reliable tools in gastroenterology.
Language: English
A Brief Review on Benchmarking for Large Language Models Evaluation in Healthcare
Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery, Journal Year: 2025, Volume and Issue: 15(2)
Published: April 9, 2025
ABSTRACT
This paper reviews benchmarking methods for evaluating large language models (LLMs) in healthcare settings. It highlights the importance of rigorous benchmarking to ensure LLMs' safety, accuracy, and effectiveness in clinical applications. The review also discusses the challenges of developing standardized benchmarks and metrics tailored to healthcare-specific tasks such as medical text generation, disease diagnosis, and patient management. Ethical considerations, including privacy, data security, and bias, are also addressed, underscoring the need for multidisciplinary collaboration to establish robust benchmarking frameworks that facilitate reliable and ethical use in healthcare. Evaluation of LLMs remains challenging due to the lack of comprehensive datasets. Key concerns include model bias and better explainability, all of which impact the overall trustworthiness of LLMs in healthcare.
Language: English
Large Language Models in Gastroenterology: Systematic Review
Journal of Medical Internet Research, Journal Year: 2024, Volume and Issue: 26, P. e66648 - e66648
Published: Dec. 20, 2024
Background
As health care continues to evolve with technological advancements, the integration of artificial intelligence into clinical practices has shown promising potential to enhance patient care and operational efficiency. Among the forefront of these innovations are large language models (LLMs), a subset of artificial intelligence designed to understand, generate, and interact with human language at an unprecedented scale.
Objective
This systematic review describes the role of LLMs in improving diagnostic accuracy, automating documentation, and advancing specialist education and patient engagement within the field of gastroenterology and gastrointestinal endoscopy.
Methods
Core databases including MEDLINE through PubMed, Embase, and Cochrane Central registry were searched using keywords related to LLMs (from inception to April 2024). Studies were included if they satisfied the following criteria: (1) any type of studies that investigated LLMs in the field of gastrointestinal endoscopy or gastroenterology, (2) studies published in English, and (3) studies in full-text format. The exclusion criteria were as follows: (1) studies that did not report LLMs in the field of gastrointestinal endoscopy or gastroenterology, (2) case reports and review papers, (3) ineligible research objects (eg, animals or basic research), and (4) insufficient data regarding LLMs. The Risk of Bias in Non-Randomized Studies of Interventions tool was used to evaluate the quality of the identified studies.
Results
Overall, 21 studies on the potential of LLMs in gastrointestinal disorders were included in the systematic review, and narrative synthesis was done because of heterogeneity in the specified aims and methodology in each included study. The overall risk of bias was low in 5 studies and moderate in 16 studies. The ability of LLMs to spread general medical information, offer advice for consultations, generate procedure reports automatically, and draw conclusions about the presumptive diagnosis of complex medical illnesses was demonstrated by the systematic review. Despite promising benefits, such as increased efficiency and improved patient outcomes, challenges related to data privacy, accuracy, and interdisciplinary collaboration remain.
Conclusions
We highlight the importance of navigating these challenges to fully leverage LLMs in transforming gastrointestinal endoscopy practices.
Trial Registration
PROSPERO 581772; https://www.crd.york.ac.uk/prospero/
Language: English
Large Language Models in Gastroenterology: Systematic Review (Preprint)
Published: Sept. 19, 2024
BACKGROUND
As health care continues to evolve with technological advancements, the integration of artificial intelligence into clinical practices has shown promising potential to enhance patient care and operational efficiency. Among the forefront of these innovations are large language models (LLMs), a subset of artificial intelligence designed to understand, generate, and interact with human language at an unprecedented scale.
OBJECTIVE
This systematic review describes the role of LLMs in improving diagnostic accuracy, automating documentation, and advancing specialist education and patient engagement within the field of gastroenterology and gastrointestinal endoscopy.
METHODS
Core databases including MEDLINE through PubMed, Embase, and Cochrane Central registry were searched using keywords related to LLMs (from inception to April 2024). Studies were included if they satisfied the following criteria: (1) any type of studies that investigated LLMs in the field of gastrointestinal endoscopy or gastroenterology, (2) studies published in English, and (3) studies in full-text format. The exclusion criteria were as follows: (1) studies that did not report LLMs in the field of gastrointestinal endoscopy or gastroenterology, (2) case reports and review papers, (3) ineligible research objects (eg, animals or basic research), and (4) insufficient data regarding LLMs. The Risk of Bias in Non-Randomized Studies of Interventions tool was used to evaluate the quality of the identified studies.
RESULTS
Overall, 21 studies on the potential of LLMs in gastrointestinal disorders were included in the systematic review, and narrative synthesis was done because of heterogeneity in the specified aims and methodology in each included study. The overall risk of bias was low in 5 studies and moderate in 16 studies. The ability of LLMs to spread general medical information, offer advice for consultations, generate procedure reports automatically, and draw conclusions about the presumptive diagnosis of complex medical illnesses was demonstrated by the systematic review. Despite promising benefits, such as increased efficiency and improved patient outcomes, challenges related to data privacy, accuracy, and interdisciplinary collaboration remain.
CONCLUSIONS
We highlight the importance of navigating these challenges to fully leverage LLMs in transforming gastrointestinal endoscopy practices.
CLINICALTRIAL
PROSPERO 581772; https://www.crd.york.ac.uk/prospero/
Language: English
The Potential Clinical Utility of the Customized Large Language Model in Gastroenterology: A Pilot Study
Bioengineering, Journal Year: 2024, Volume and Issue: 12(1), P. 1 - 1
Published: Dec. 24, 2024
Background: The large language model (LLM) has the potential to be applied to clinical practice. However, there has been scarce study on this in the field of gastroenterology.
Aim: This study explores the potential clinical utility of two LLMs in the field of gastroenterology: a customized GPT model and a conventional GPT-4o, an advanced LLM capable of retrieval-augmented generation (RAG).
Method: We established a customized GPT with the BM25 algorithm using OpenAI’s GPT-4o model, which allows it to produce responses in the context of specific documents including textbooks of internal medicine (in English) and gastroenterology (in Korean). Also, we prepared access to a conventional ChatGPT 4o (accessed on 16 October 2024). The benchmark (written in Korean) consisted of 15 clinical questions developed by four clinical experts, representing typical questions for medical students. The two LLMs, a gastroenterology fellow, and an expert gastroenterologist were tested to assess their performance.
Results: While the customized LLM correctly answered 8 out of 15 questions, the fellow answered 10 correctly. When the standardized Korean medical terms were replaced with English terminology, the LLM’s performance improved, answering two additional knowledge-based questions correctly, matching the fellow’s score. However, judgment-based questions remained a challenge for the model. Even with the implementation of ‘Chain of Thought’ prompt engineering, the customized GPT did not achieve improved reasoning. Conventional GPT-4o achieved the highest score among the AI models (14/15). Although both models performed slightly below the expert gastroenterologist’s level (15/15), they show promising potential for clinical applications (scores comparable with or higher than that of the gastroenterology fellow).
Conclusions: LLMs could be utilized to assist with specialized tasks such as patient counseling. However, RAG capabilities, which enable real-time retrieval of external data not included in the training dataset, appear essential for managing complex, specialized content, and clinician oversight will remain crucial to ensure safe and effective use in clinical practice.
Language: English
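The pilot study above describes a customized GPT that uses the BM25 algorithm to pull context from specific documents before answering. As a rough illustration only, the following is a minimal sketch of Okapi BM25 ranking in plain Python. It is not the study's implementation: the whitespace tokenizer, the parameter values k1=1.5 and b=0.75 (common defaults), the function names `bm25_scores` and `top_passage`, and the three sample passages are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document (whitespace tokenization assumed)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps the value non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_passage(query, docs):
    """Pick the highest-scoring passage, i.e. the context a RAG prompt would include."""
    scores = bm25_scores(query, docs)
    return docs[max(range(len(docs)), key=scores.__getitem__)]


# Hypothetical passages standing in for textbook excerpts.
passages = [
    "hepatitis c is treated with direct acting antiviral agents",
    "colonoscopy is used for colorectal cancer screening",
    "proton pump inhibitors reduce gastric acid secretion",
]
best = top_passage("hepatitis c treatment antiviral", passages)
```

In a retrieval-augmented setup, the selected passage would be prepended to the user's question before it is sent to the language model. Whitespace splitting would not suit Korean text well, so a real pipeline over Korean textbooks would likely need a language-aware tokenizer.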