irAE-GPT: Leveraging large language models to identify immune-related adverse events in electronic health records and clinical trial datasets
Cosmin A. Bejan, Michelle Wang, Sriram Venkateswaran, et al.
medRxiv (Cold Spring Harbor Laboratory)
Journal Year: 2025
Volume and Issue: unknown
Published: March 6, 2025
Abstract
Background
Large language models (LLMs) have emerged as transformative technologies, revolutionizing natural language understanding and generation across various domains, including medicine. In this study, we investigated the capabilities, limitations, and generalizability of Generative Pre-trained Transformer (GPT) models in analyzing unstructured patient notes from large healthcare datasets to identify immune-related adverse events (irAEs) associated with the use of immune checkpoint inhibitor (ICI) therapy.
Methods
We evaluated the performance of GPT-3.5, GPT-4, and GPT-4o models on manually annotated datasets of patients receiving ICI therapy, sampled from two electronic health record (EHR) systems and seven clinical trials. A zero-shot prompt was designed to exhaustively identify irAEs at the patient level (main analysis) and the note level (secondary analysis). The LLM-based system followed a multi-label classification approach to identify any combination of irAEs associated with individual patients or clinical notes. System evaluation was conducted for each available irAE as well as for broader categories of irAEs classified at the organ level.
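The multi-label evaluation described in the Methods can be illustrated with a minimal sketch. It is not the authors' evaluation code: the label names, the toy gold and predicted label sets, and the helper names (`Counts`, `evaluate`) are assumptions made for illustration. Each patient (or note) is represented as a set of irAE labels; per-label sensitivity, specificity, positive predictive value (PPV), and F1 are computed from a confusion table, and the micro-averaged F1 pools the counts across all labels.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion-table counts for one label (or pooled over labels)."""
    tp: int = 0
    fp: int = 0
    fn: int = 0
    tn: int = 0


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def f1_score(c: Counts) -> float:
    """F1 expressed directly in terms of counts: 2TP / (2TP + FP + FN)."""
    return safe_div(2 * c.tp, 2 * c.tp + c.fp + c.fn)


def evaluate(gold, pred, labels):
    """gold and pred are lists of label sets, one set per patient or note."""
    per_label = {label: Counts() for label in labels}
    for g, p in zip(gold, pred):
        for label in labels:
            c = per_label[label]
            if label in g and label in p:
                c.tp += 1
            elif label in p:
                c.fp += 1
            elif label in g:
                c.fn += 1
            else:
                c.tn += 1

    report = {
        label: {
            "sensitivity": safe_div(c.tp, c.tp + c.fn),
            "specificity": safe_div(c.tn, c.tn + c.fp),
            "ppv": safe_div(c.tp, c.tp + c.fp),
            "f1": f1_score(c),
        }
        for label, c in per_label.items()
    }

    # Micro-averaging pools the counts over all labels before computing F1.
    pooled = Counts(
        tp=sum(c.tp for c in per_label.values()),
        fp=sum(c.fp for c in per_label.values()),
        fn=sum(c.fn for c in per_label.values()),
    )
    return report, f1_score(pooled)


# Hypothetical toy data: four patients, three irAE labels.
LABELS = ["pneumonitis", "colitis", "rash"]
GOLD = [{"pneumonitis"}, {"colitis", "rash"}, set(), {"colitis"}]
PRED = [{"pneumonitis", "rash"}, {"colitis"}, set(), {"colitis", "rash"}]

report, micro_f1 = evaluate(GOLD, PRED, LABELS)
for label, metrics in report.items():
    print(label, metrics)
print("micro-averaged F1:", round(micro_f1, 3))
```

In this toy example the over-predicted label ("rash") pulls the micro-averaged F1 below the per-label F1 of the two correctly predicted labels, which is the kind of effect a per-irAE breakdown makes visible.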
Results
Our analysis included 442 patients across three institutions. The most common irAEs manually identified in the patient datasets included pneumonitis (N=64), colitis (N=56), rash (N=32), and hepatitis (N=28). Overall, GPT models achieved high sensitivity and specificity but only moderate positive predictive values, reflecting a potential bias towards overpredicting irAE outcomes. GPT-4o achieved the highest F1 and micro-averaged F1 scores for both patient-level and note-level evaluations. Highest performance was observed in the hematological (F1 range=1.0-1.0), gastrointestinal (F1 range=0.81-0.85), and musculoskeletal and rheumatologic (F1 range=0.67-1.0) irAE categories. Error analysis uncovered substantial limitations of GPT models in handling textual causation, where adverse events should not only be accurately identified in clinical text but also causally linked to immune checkpoint inhibitors.
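The pattern reported in the Results (high sensitivity and specificity alongside only moderate positive predictive values) is what Bayes' rule predicts when the target event is uncommon. The sketch below is a generic illustration, not an analysis from the paper: the function name `ppv` and the input values (sensitivity 0.9, specificity 0.9, prevalence 0.1) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule.

    PPV = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    total_pos = true_pos + false_pos
    return true_pos / total_pos if total_pos else 0.0


# Hypothetical values, not taken from the study.
for prevalence in (0.5, 0.1, 0.02):
    value = ppv(sensitivity=0.9, specificity=0.9, prevalence=prevalence)
    print(f"prevalence={prevalence:.2f} -> PPV={value:.2f}")
```

With sensitivity and specificity both fixed at 0.9, the PPV falls from 0.9 at 50% prevalence to about 0.5 at 10% prevalence, so even a small false-positive rate produces many false alarms relative to true cases when an irAE is rare.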
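Conclusion
The GPT models demonstrated generalizable abilities in identifying irAEs across EHRs and clinical trial reports. Using GPT models to automate adverse event detection in large healthcare datasets will reduce the burden on physicians and healthcare professionals by eliminating the need for manual review. This will strengthen safety monitoring and lead to improved patient care.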
Language: English
Mapping artificial intelligence models in emergency medicine: A scoping review on artificial intelligence performance in emergency care and education
Turkish Journal of Emergency Medicine
Journal Year: 2025
Volume and Issue: 25(2), pp. 67-91
Published: April 1, 2025
Artificial intelligence (AI) is increasingly improving processes such as emergency patient care and emergency medicine education. This scoping review aims to map the use and performance of AI models in emergency medicine regarding AI concepts. The findings show that AI-based medical imaging systems provide disease detection with 85%-90% accuracy in imaging techniques such as X-ray and computed tomography scans. In addition, AI-supported triage systems were found to be successful in correctly classifying low- and high-urgency patients. In education, large language models have provided high accuracy rates in evaluating emergency medicine exams. However, there are still challenges in the integration of AI into clinical workflows and in model generalization capacity. These findings demonstrate the potential of updated AI models, but larger-scale studies are still needed.
Language: English