Current Opinion in Ophthalmology,
Journal Year: 2024
Volume and Issue: unknown
Published: Aug. 27, 2024
Purpose of review
Vision Language Models are an emerging paradigm in artificial intelligence that offers the potential to natively analyze both image and textual data simultaneously, within a single model. The fusion of these two modalities is of particular relevance to ophthalmology, which has historically involved specialized imaging techniques such as angiography, optical coherence tomography, and fundus photography, while also interfacing with electronic health records that include free text descriptions. This review surveys this fast-evolving field and how these models apply to current ophthalmologic research and practice.
Recent findings
Although models incorporating both imaging and text data have a long provenance in ophthalmology, effective multimodal models are a recent development exploiting advances in technologies such as transformer and autoencoder models.
Summary
Vision Language Models offer the potential to assist and streamline the existing clinical workflow in ophthalmology, whether previsit, during, or post-visit. There are, however, important challenges to be overcome, particularly regarding patient privacy and the explainability of model recommendations.
British Journal of Ophthalmology,
Journal Year: 2024
Volume and Issue: 108(10), P. 1335 - 1340
Published: June 26, 2024
The rapid advancements in generative artificial intelligence are set to significantly influence the medical sector, particularly ophthalmology. Generative adversarial networks and diffusion models enable the creation of synthetic images, aiding the development of deep learning models tailored for specific imaging tasks. Additionally, the advent of multimodal foundational models, capable of generating text and videos, presents a broad spectrum of applications within ophthalmology. These range from enhancing diagnostic accuracy to improving patient education and training healthcare professionals. Despite this promising potential, this area of technology is still in its infancy, and there are several challenges to be addressed, including data bias, safety concerns, and the practical implementation of these technologies in clinical settings.
Ophthalmology and Therapy,
Journal Year: 2025
Volume and Issue: unknown
Published: Feb. 22, 2025
Effective management of pediatric myopia, which includes treatments like corrective lenses and low-dose atropine, requires accurate clinical decisions. However, the complexity of refractive data, such as variations in visual acuity, axial length, and patient-specific factors, poses challenges to determining the optimal treatment. This study aims to evaluate the performance of three large language models in analyzing these data. A dataset of 100 records, including parameters such as visual acuity and axial length, was analyzed using ChatGPT-3.5, ChatGPT-4o, and Wenxin Yiyan, respectively. Each model was tasked with determining whether intervention was needed and subsequently recommending a treatment (eyeglasses, orthokeratology lens, or atropine). The recommendations were compared with professional optometrists' consensus, rated on a 1–5 Global Quality Score (GQS) scale, and evaluated for safety utilizing a three-tier accuracy assessment. ChatGPT-4o outperformed both ChatGPT-3.5 and Wenxin Yiyan in determining intervention needs, with an accuracy of 90%, significantly higher than the other models (p < 0.05). It also achieved the highest GQS of 4.4 ± 0.55, surpassing the other models (p < 0.001), with 85% of its responses rated "good", ahead of the other two models (82% and 74%). ChatGPT-4o made only eight errors in recommending interventions, fewer than the other two models (12 and 15). Additionally, it performed better on incomplete or abnormal data, maintaining its quality scores. ChatGPT-4o showed strong safety, making it a promising tool for clinical decision support in ophthalmology, although expert oversight is still necessary.
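Significance claims like the 90% intervention-call accuracy above rest on a test of proportions. A minimal pure-Python sketch of a two-sided two-proportion z-test (statistically equivalent to a 2x2 chi-square test); the counts used here are illustrative assumptions, not the study's raw data:

```python
from math import sqrt, erf

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled proportion under H0: p1 == p2
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF, via math.erf.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Illustrative counts only: 90/100 correct vs 78/100 correct.
z, p = two_proportion_z_test(90, 100, 78, 100)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these assumed counts the difference clears the p < 0.05 threshold reported in the abstract.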
Eye,
Journal Year: 2025
Volume and Issue: unknown
Published: April 15, 2025
Abstract
Purpose
Large language models have shown promise in answering questions related to medical conditions. This study evaluated the responses of ChatGPT-4 to patient-centred frequently asked questions (FAQs) relevant to age-related macular degeneration (AMD).
Methods
Ten experts were recruited across a range of clinical, education and research practices in optometry and ophthalmology. Over 200 patient-centric FAQs from authoritative professional society, hospital and advocacy websites were condensed into 37 questions across four themes: definition, causes and risk factors, symptoms and detection, and treatment and follow-up. The questions were individually input into ChatGPT-4 to generate responses. The responses were graded by the experts using a 5-point Likert scale (1 = strongly disagree; 5 = strongly agree) across four domains: coherency, factuality, comprehensiveness, and safety.
Results
Across all themes and domains, the median scores were 4 ("agree"). Comprehensiveness had the lowest scores of the domains (mean 3.8 ± 0.8), followed by factuality (3.9), safety (4.1 ± 0.8) and coherency (4.3 ± 0.7). Examination of individual questions showed that 5 (14%), 21 (57%), 23 (62%) and 9 (24%) questions had average scores below 4 (below "agree") for coherency, factuality, comprehensiveness and safety, respectively. Free-text comments highlighted issues with superseded or older technologies and with techniques that are not routinely used in clinical practice, such as genetic testing.
Conclusions
ChatGPT-4 responses to FAQs about AMD were generally agreeable in terms of coherency, factuality, comprehensiveness and safety. However, areas of weakness were identified, precluding recommendations for routine use to provide patients with tailored counselling on AMD.
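The per-domain summaries above (a median on a 5-point Likert scale, plus mean ± SD) can be reproduced with a short aggregation sketch using only the standard library; the ratings below are hypothetical illustrations, not the study's grading data:

```python
from statistics import median, mean, stdev

# Hypothetical 5-point Likert ratings from ten graders, per domain.
ratings = {
    "coherency":         [4, 5, 4, 4, 5, 4, 4, 5, 4, 4],
    "comprehensiveness": [3, 4, 4, 3, 4, 4, 3, 4, 5, 4],
}

# Report the same summary statistics the abstract uses per domain.
for domain, scores in ratings.items():
    print(f"{domain}: median={median(scores)}, "
          f"mean={mean(scores):.1f} ± {stdev(scores):.1f}")
```

Note that with ten graders the median is the average of the two middle ratings, which is why a domain can have a median of 4 while its mean sits below 4.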
Clinical Chemistry and Laboratory Medicine (CCLM),
Journal Year: 2025
Volume and Issue: unknown
Published: April 18, 2025
Abstract
Objectives
Accurate medical laboratory reports are essential for delivering high-quality healthcare. Recently, advanced artificial intelligence models, such as those in the ChatGPT series, have shown considerable promise in this domain. This study assessed the performance of specific GPT models, namely GPT-4o, o1, and o1 mini, in identifying errors within laboratory reports and providing treatment recommendations.
Methods
In this retrospective study, 86 nucleic acid test reports covering seven upper respiratory tract pathogens were compiled. A total of 285 errors from four common error categories were intentionally and randomly introduced into the reports, generating incorrect reports. The GPT models were tasked with detecting these errors, using three senior medical laboratory scientists (SMLS) and three medical laboratory interns (MLI) as control groups. Additionally, the models were tasked with generating accurate and reliable treatment recommendations following positive test outcomes based on the corrected reports. χ2 tests, Kruskal-Wallis tests, and Wilcoxon tests were used for statistical analysis where appropriate.
Results
In comparison with SMLS or MLI, the GPT models accurately detected three error types, with average detection rates of 88.9 % (omission), 91.6 % (time sequence), and 91.7 % (the same individual acting as both inspector and reviewer). However, the detection rate for result input format errors was only 51.9 %, indicating relatively poor performance in this aspect. The GPT models exhibited substantial to almost perfect agreement with SMLS in total error detection (kappa [min, max]: 0.778, 0.837), while agreement between the models and MLI was moderately lower (0.632, 0.696). In reading all reports, the GPT models showed obviously reduced time compared with SMLS or MLI (all p<0.001). Notably, our study also found that the GPT-o1 mini model had better consistency in error identification than the GPT-o1 model, which in turn was better than the GPT-4o model; pairwise comparisons of each model's outputs across repeated runs gave kappa values of 0.912 to 0.996. For treatment recommendations, GPT-o1 significantly outperformed the other models (all p<0.0001).
Conclusions
The GPT models' capability to identify some error types, and the accuracy and reliability of their treatment recommendations, were competent, potentially reducing work hours and enhancing clinical decision-making.
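The agreement figures above are Cohen's kappa, a chance-corrected agreement statistic between two raters. A minimal sketch of how it is computed from paired detection labels; the labels below are hypothetical, not data from the study:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labelled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement under independence, from each rater's marginals.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    p_e = sum(count_a[lab] * count_b[lab] for lab in labels) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical per-report labels: 1 = error detected, 0 = error missed.
model = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
smls  = [1, 1, 0, 1, 0, 0, 1, 1, 1, 1]
print(round(cohens_kappa(model, smls), 3))
```

On the conventional Landis–Koch scale, 0.61–0.80 is "substantial" and 0.81–1.00 is "almost perfect" agreement, which is the vocabulary the abstract uses for the 0.778–0.837 range.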
Ophthalmology and Therapy,
Journal Year: 2024
Volume and Issue: unknown
Published: Nov. 8, 2024
Cataracts are a significant cause of blindness. While individuals frequently turn to the Internet for medical advice, distinguishing reliable information can be challenging. Large language models (LLMs) have attracted attention for generating accurate, human-like responses that may be used for medical consultation. However, a comprehensive assessment of LLMs' accuracy within specific medical domains is still lacking.
Journal of Clinical Medicine,
Journal Year: 2024
Volume and Issue: 13(21), P. 6512 - 6512
Published: Oct. 30, 2024
This study evaluates the ability of six popular chatbots (ChatGPT-3.5, ChatGPT-4.0, Gemini, Copilot, Chatsonic, and Perplexity) to provide reliable answers to questions concerning keratoconus.