medRxiv (Cold Spring Harbor Laboratory),
Journal year: 2024, Issue: unknown
Published: Aug. 21, 2024
Abstract
Background
Generative AI models that can produce photorealistic images from text descriptions have many applications in medicine, including medical education and synthetic data. However, it can be challenging to evaluate and compare their range of heterogeneous outputs; thus there is a need for a systematic approach enabling image model comparisons.
Methods
We develop an error classification system for annotating errors in AI-generated images of humans and apply our method to a corpus of 240 images generated with three different models (DALL-E 3, Stable Diffusion XL, and Stable Cascade) using 10 prompts and 8 images per prompt. The system identifies five error types and their severities across anatomical regions, and specifies an associated quantitative scoring approach based on aggregated proportions of the expected count of components in the image. We assess inter-rater agreement by double-annotating 25% of the corpus and calculating Krippendorff's alpha, and we compare results across the ten prompts quantitatively using a cumulative error score.
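The authors' actual analysis scripts are in the linked GitHub repository; as an illustrative sketch only, the Krippendorff's alpha calculation on a double-annotated subset could be computed with the open-source `krippendorff` Python package, using hypothetical annotation codes:

```python
# Sketch: Krippendorff's alpha for two annotators on a double-annotated subset.
# Assumes the third-party `krippendorff` package (pip install krippendorff).
import numpy as np
import krippendorff

# Hypothetical nominal error codes assigned by two annotators to the same images;
# None marks an image that one annotator did not label.
rater_a = [0, 2, 1, 1, 3, 0, 2, None]
rater_b = [0, 2, 2, 1, 3, 0, 1, 0]

# reliability_data has shape (n_raters, n_units); np.nan encodes missing values.
reliability_data = np.array(
    [[np.nan if v is None else v for v in rater] for rater in (rater_a, rater_b)],
    dtype=float,
)

alpha = krippendorff.alpha(reliability_data=reliability_data,
                           level_of_measurement="nominal")
print(f"Krippendorff's alpha: {alpha:.3f}")
```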
Findings
The classification system, the accompanying training manual, the image collection, annotations, and all scripts are available in a GitHub repository at https://github.com/hastingslab-org/ai-human-images. Inter-rater agreement was relatively poor, reflecting the subjectivity of the annotation task.
Model comparisons revealed that DALL-E 3 performed consistently better than Stable Diffusion; however, the latter showed more diversity in personal attributes.
Images depicting groups of people were more error-prone than those depicting individuals or pairs; some errors were common to all models.
Interpretation
Our approach enables the comparison of generated images of humans across models; it may serve to catalyse improvements in these models for medical applications.
Funding
This study received support from the University of Zurich's Digital Society Initiative and the Swiss National Science Foundation under grant 209510.
Research context
Evidence before this study
The authors searched PubMed and Google Scholar to find publications evaluating text-to-image model outputs between 2014 (when generative adversarial networks first became available) and 2024. While the bulk of evaluations focused on task-specific models generating a single type of image, a few studies have emerged exploring novel general-purpose diffusion models more broadly, including for synthetic data generation. We found no previous work that attempts to evaluate the models' representations of human anatomy.
Added value of this study
We present a prompt-based, large-scale evaluation of state of the art models from two model families.
Implications of all the available evidence
Our approach enables model comparisons but remains limited by the labour-intensive manual annotation of the represented figures. Future research should explore automating aspects of the evaluation, for example through approaches coupled with image segmentation.
Journal of Cancer Survivorship,
Journal year: 2025, Issue: unknown
Published: March 1, 2025
Cancer survivorship begins at diagnosis and encompasses a wide variety of experiences, yet prominent societal narratives emphasize a positive, post-treatment "return-to-normal." These representations shape how survivorship is understood and experienced by cancer survivors and the public. This study aimed to (1) characterize artificial intelligence (AI)-generated images of cancer survivors and (2) compare them with images of cancer patients to understand how these images might reflect and amplify prevalent narratives.
Two AI text-to-image tools (DALL-E, Stable Diffusion) were prompted to generate 40 images each of cancer survivors and cancer patients (n = 160 images). Images were coded for perceived demographics, affect, health, markers of illness or cancer, and setting. Chi-square analyses tested differences between the patient and survivor images. Quantitative data were complemented by coders' qualitative insights.
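As a rough sketch of the kind of chi-square comparison described above (using hypothetical category counts, not the study's data), such a test could be run with SciPy:

```python
# Sketch: chi-square test of independence for one coded attribute
# (e.g., perceived affect) across the two image groups. Counts are hypothetical.
from scipy.stats import chi2_contingency

#                 positive  neutral/negative
counts = [[55, 25],   # survivor images
          [35, 45]]   # patient images

chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"chi2={chi2:.2f}, dof={dof}, p={p_value:.4f}")
```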
Individuals in the AI-generated images were largely depicted as White (80%), feminine, young (51%), happy (69%), and healthy; many were observed to conform to Western beauty ideals. Pink (64%), ribbons (35%), and head scarves (51%) were prominent visual features in the survivor images.
Compared with patient images, survivor images more frequently featured individuals perceived as non-White (p = .03) and affectively positive (p < .001), and less frequently included markers of illness, such as portraying the individual in bed (p < .001) or in medical settings (p < .001).
The images fail to capture the breadth of survivor demographics and experience and may perpetuate narrow views of survivorship.
Journal of Clinical Medicine,
Journal year: 2025, Issue: 14(7), pp. 2136-2136
Published: March 21, 2025
Background: Anatomically accurate illustrations are imperative in medical education, serving as crucial tools to facilitate comprehension of complex anatomical structures. While traditional illustration methods involving human artists remain the gold standard, the rapid advancement of Generative Artificial Intelligence (GAI) models presents a new opportunity to automate and accelerate this process. This study evaluated the potential of GAI to produce illustrations of craniofacial anatomy for educational purposes.
Methods: Four GAI models, including Midjourney v6.0, DALL-E 3, Gemini Ultra 1.0, and Stable Diffusion 2.0, were used to generate 736 images across multiple views of the surface anatomy, bones, muscles, blood vessels, and nerves of the cranium, in both oil painting and realistic photograph styles. Reviewers evaluated the images for anatomical detail, aesthetic quality, usability, and cost-effectiveness. Inter-rater reliability analysis assessed evaluation consistency.
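A minimal sketch of how such an inter-rater consistency check could be computed, assuming reviewer scores in long format and the open-source `pingouin` package (the column names and ratings below are hypothetical, not the study's data):

```python
# Sketch: intraclass correlation coefficient (ICC) for reviewer score consistency.
# Assumes the third-party `pingouin` package (pip install pingouin).
import pandas as pd
import pingouin as pg

# Hypothetical long-format ratings: each image scored by each reviewer.
df = pd.DataFrame({
    "image":    ["img1", "img1", "img2", "img2", "img3", "img3"],
    "reviewer": ["R1", "R2", "R1", "R2", "R1", "R2"],
    "score":    [4, 5, 2, 3, 5, 5],
})

icc = pg.intraclass_corr(data=df, targets="image", raters="reviewer", ratings="score")
# The returned table lists the ICC variants (ICC1..ICC3k) with 95% confidence intervals.
print(icc[["Type", "ICC", "CI95%"]])
```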
Results: Midjourney v6.0 scored highest for aesthetic quality and cost-effectiveness, while DALL-E 3 performed best for anatomical detail and usability. The inter-rater reliability analysis demonstrated a high level of agreement among reviewers (ICC = 0.858, 95% CI). However, all models showed significant flaws in depicting details such as foramina, suture lines, muscular origins/insertions, and neurovascular structures. These limitations were further characterized by abstract depictions, mixing of layers, inaccurate shadowing, abnormal muscle arrangements, and labeling errors.
Conclusions: These findings highlight GAI's potential for rapidly creating anatomical illustrations, but also its current limitations due to inadequate training data and an incomplete understanding of anatomy. Refining these models through precise expert feedback is vital. Ethical considerations, including biases, copyright challenges, and risks of propagating inaccurate information, must be carefully navigated. Further refinement and ethical safeguards are essential for safe use.
This study aimed to task and assess generative artificial intelligence (AI) models in creating medical illustrations for corneal transplant procedures such as Descemet's stripping automated endothelial keratoplasty (DSAEK), Descemet's membrane endothelial keratoplasty (DMEK), deep anterior lamellar keratoplasty (DALK), and penetrating keratoplasty (PKP). Methods: Six engineered prompts were provided to Decoder-Only Autoregressive Language Image Synthesis 3 (DALL-E 3) and Medical Illustration Manager (MIM) to guide these AI models toward a final illustration for each of the four procedures.
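To make the workflow concrete, a prompt could be sent to the DALL-E 3 image endpoint roughly as sketched below; this is an illustrative call with a hypothetical prompt, not the study's engineered prompts, and it assumes the official `openai` Python client with an OPENAI_API_KEY in the environment.

```python
# Sketch: requesting a single illustration from DALL-E 3 via the OpenAI API.
# Assumes the `openai` Python package and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

# Hypothetical prompt; the study used six engineered prompts per procedure.
prompt = (
    "Cross-sectional medical illustration of a human cornea during "
    "penetrating keratoplasty, clean schematic style, labeled layers"
)

result = client.images.generate(
    model="dall-e-3",
    prompt=prompt,
    size="1024x1024",
    quality="standard",
    n=1,  # DALL-E 3 returns one image per request
)
print(result.data[0].url)
```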
Control illustrations were created by the authors for technique comparison. A grading system with five categories and a maximum score of three points per category (15 points total) was designed to objectively evaluate the AI's performance. Four independent reviewers analyzed and scored the images produced by DALL-E 3 and MIM as well as the control illustrations. All AI-generated images were then evaluated by Chat Generative Pre-Trained Transformer-4o (ChatGPT-4o), which was tasked with grading each image as described above. The results were tabulated and graphically depicted.
Journal of Medical Internet Research,
Journal year: 2024, Issue: 26, pp. e60312-e60312
Published: Dec. 4, 2024
The last 25 years have seen enormous progression in digital technologies across the whole of the health service, including education. The rapid evolution and use of web-based techniques have been significantly transforming this field since the beginning of the new millennium. These advancements continue to progress swiftly, even more so after the COVID-19 pandemic.
To utilize artificial intelligence (AI) platforms to generate medical illustrations for refractive surgeries, aiding patients in visualizing and comprehending procedures like laser-assisted in situ keratomileusis (LASIK), photorefractive keratectomy (PRK), and small incision lenticule extraction (SMILE). This study displays the current performance of two OpenAI programs in terms of their accuracy in illustrating common corneal procedures.
Research Square (Research Square),
Journal year: 2024, Issue: unknown
Published: July 25, 2024
Abstract
The wide usage of artificial intelligence (AI) text-to-image generators raises concerns about the role of AI in amplifying misconceptions in healthcare. This study therefore evaluated the demographic accuracy and potential biases in the depiction of patients by two commonly used generators. A total of 4,580 images of patients with 29 different diseases was generated using Bing Image Generator and Meta Imagine. Eight independent raters determined the sex, age, weight group, and race and ethnicity of the individuals depicted.
Comparison to real-world epidemiology showed that the generators failed to depict demographic characteristics such as sex, age, and race and ethnicity accurately.
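A minimal sketch of such a comparison, using hypothetical rated counts against hypothetical epidemiological reference proportions with a chi-square goodness-of-fit test in SciPy (not the study's actual data or stated method):

```python
# Sketch: comparing the rated sex distribution in generated images for one disease
# against a real-world epidemiological reference with a goodness-of-fit test.
# All numbers are hypothetical placeholders.
from scipy.stats import chisquare

observed = [120, 40]            # images rated as female / male
reference_props = [0.55, 0.45]  # hypothetical real-world prevalence by sex

expected = [p * sum(observed) for p in reference_props]
stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2={stat:.2f}, p={p_value:.4f}")
```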
In addition, we observed an over-representation of White as well as normal-weight individuals. Inaccuracies may stem from non-representative and non-specific training data as well as insufficient or misdirected bias mitigation strategies. As a consequence, new strategies to counteract these inaccuracies are needed.