European Journal of Therapeutics, Journal Year: 2024, Volume and Issue: 30(6), P. 844 - 849. Published: Dec. 31, 2024
Objective: Artificial Intelligence (AI) offers opportunities for radiologists to enhance workflow efficiency, perform faster and repeatable segmentation, and detect lesions more easily. The aim of this study is to investigate the current knowledge and general attitudes of radiology resident physicians towards AI. Additionally, it seeks to assess the state of AI/ML/DL education in residency and the awareness and use of available educational resources.

Methods: A cross-sectional study was conducted using an online survey from October 2023 to February 2024. The survey included demographic data, AI knowledge, attitudes towards AI, and the role of AI in medical education. Survey questions were developed based on the literature and reviewed by experts in radiology.

Results: There were 155 participants (38.7% female) with an average age of 28.81±4.77 years. About 80.6% were aware of AI terms, with a mean score of 3.02±1.39 on a 7-point Likert scale. Most (90.3%) had no programming knowledge. Only 22.6% used AI tools occasionally. The majority (73.4%) believed AI would change radiology's future, though only 10.3% felt radiologists' jobs were at risk. Regarding education, 84.5% reported receiving no formal AI training, and awareness and use of educational resources were low.

Conclusion: The study found that while awareness of AI among radiology residents is high, their practical knowledge and use of AI are limited. AI education is largely absent from residency programs. These findings highlight the need for integrating AI training into residency programs and increasing awareness of available educational resources.
Frontiers in Medicine, Journal Year: 2024, Volume and Issue: 11. Published: Oct. 29, 2024
Large Language Models (LLMs) are sophisticated algorithms that analyze and generate vast amounts of textual data, mimicking human communication. Notable LLMs include GPT-4o by OpenAI, Claude 3.5 Sonnet by Anthropic, and Gemini by Google. This scoping review aims to synthesize the current applications and potential uses of LLMs in patient education and engagement.
BMC Oral Health, Journal Year: 2025, Volume and Issue: 25(1). Published: Feb. 1, 2025
This study evaluates and compares the performance of ChatGPT-3.5, ChatGPT-4 Omni (4o), Google Bard, and Microsoft Copilot in responding to text-based multiple-choice questions related to oral radiology, as featured in the Dental Specialty Admission Exam conducted in Türkiye. A collection of questions was sourced from the open-access question bank of the Turkish Dental Specialty Admission Exam, covering the years 2012 to 2021. The collection included 123 questions, each with five options and one correct answer. The accuracy levels of ChatGPT-3.5, ChatGPT-4o, Google Bard, and Microsoft Copilot were compared using descriptive statistics, the Kruskal-Wallis test, Dunn's post hoc test, and Cochran's Q test. The responses generated by the four chatbots exhibited statistically significant differences (p = 0.000). ChatGPT-4o achieved the highest accuracy at 86.1%, followed by Google Bard at 61.8%. ChatGPT-3.5 demonstrated an accuracy rate of 43.9%, while Microsoft Copilot recorded a rate of 41.5%. ChatGPT-4o showcases superior advanced reasoning capabilities, positioning it as a promising educational tool. With regular updates, it has the potential to serve as a reliable source of information for both healthcare professionals and the general public.
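As a concrete illustration of the comparison this abstract describes, the sketch below runs Cochran's Q and Kruskal-Wallis tests on simulated per-question results. The binary outcome arrays, the simulated accuracy rates, and the use of scipy/statsmodels are assumptions for illustration, not the study's actual data or pipeline; Dunn's post hoc test is available separately, for example in the scikit-posthocs package.

```python
# A minimal sketch, assuming hypothetical per-question results
# (1 = correct, 0 = incorrect) for four chatbots over the same
# 123 questions; the arrays are simulated, not study data.
import numpy as np
from scipy.stats import kruskal
from statsmodels.stats.contingency_tables import cochrans_q

rng = np.random.default_rng(0)
n_questions = 123
# Simulated binary outcomes with roughly the reported accuracy rates.
results = {
    "ChatGPT-4o": rng.binomial(1, 0.861, n_questions),
    "Google Bard": rng.binomial(1, 0.618, n_questions),
    "ChatGPT-3.5": rng.binomial(1, 0.439, n_questions),
    "Microsoft Copilot": rng.binomial(1, 0.415, n_questions),
}

# Cochran's Q: do four related binary samples share one success rate?
res = cochrans_q(np.column_stack(list(results.values())))
print(f"Cochran's Q = {res.statistic:.2f}, p = {res.pvalue:.4g}")

# Kruskal-Wallis across the four groups, as also reported above.
h, p = kruskal(*results.values())
print(f"Kruskal-Wallis H = {h:.2f}, p = {p:.4g}")
```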
Medicine, Journal Year: 2025, Volume and Issue: 104(2), P. e41059 - e41059. Published: Jan. 10, 2025
This study evaluates the efficacy of GPT-4, a Large Language Model, in simplifying medical literature for enhancing patient comprehension in glaucoma care. GPT-4 was used to transform published abstracts from 3 journals (n = 62) and patient education materials (PEMs, n = 9) to a 5th-grade reading level. It was also prompted to generate de novo educational outputs at 6 different reading levels (5th Grade, 8th Grade, High School, Associate's, Bachelor's, and Doctorate). Readability of both transformed and de novo materials was quantified using the Flesch Kincaid Grade Level (FKGL) and Flesch Kincaid Reading Ease (FKRE) Score. Latent semantic analysis (LSA) using cosine similarity was applied to assess content consistency between original and transformed materials. The transformation of abstracts resulted in the FKGL decreasing by an average of 3.21 points (30%, P < .001) and the FKRE increasing by 28.6 points (66%, P < .001). For PEMs, the FKGL decreased by 2.38 points (28%, P = .0272) and the FKRE increased by 12.14 points (19%, P = .0459). LSA revealed high content consistency, with cosine similarity of 0.861 for abstracts and 0.937 for PEMs, signifying that topical themes were quantitatively shown to be consistent. The study shows that GPT-4 effectively simplifies medical information about glaucoma, making it more accessible while maintaining textual content. The improved readability scores of transformed and de novo generated materials demonstrate its usefulness across reading levels.
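For readers unfamiliar with these metrics, the sketch below implements the standard Flesch-Kincaid formulas and a simple cosine-similarity check. The naive vowel-group syllable counter and the use of TF-IDF vectors (rather than the study's full LSA pipeline) are simplifying assumptions, and the two example sentences are invented.

```python
# A minimal sketch, assuming the standard Flesch-Kincaid formulas and
# TF-IDF cosine similarity as a stand-in for full LSA.
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def count_syllables(word: str) -> int:
    # Naive heuristic: count groups of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_scores(text: str) -> tuple[float, float]:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    wps = len(words) / sentences                               # words per sentence
    spw = sum(count_syllables(w) for w in words) / len(words)  # syllables per word
    fkgl = 0.39 * wps + 11.8 * spw - 15.59        # Flesch-Kincaid Grade Level
    fkre = 206.835 - 1.015 * wps - 84.6 * spw     # Flesch Reading Ease
    return fkgl, fkre

original = "Intraocular pressure reduction remains the mainstay of glaucoma therapy."
simplified = "Lowering the pressure inside the eye is still the main glaucoma treatment."

print(fk_scores(original), fk_scores(simplified))
tfidf = TfidfVectorizer().fit_transform([original, simplified])
print("cosine similarity:", cosine_similarity(tfidf[0], tfidf[1])[0, 0])
```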
Research on Social Work Practice, Journal Year: 2025, Volume and Issue: unknown. Published: Jan. 17, 2025
Purpose: This study examines the comparative efficacy of three AI large language models (LLMs), ChatGPT-4, Gemini, and Microsoft Copilot, in clinical social work.

Method: By presenting scenarios of varying complexities, we assessed their performance using the Ateşman Readability Index and a Likert-type accuracy scale.

Results: The results showed that Gemini had the highest accuracy, while Copilot excelled in readability. Significant differences were found in accuracy scores (p = .003), although differences in readability were not statistically significant (p = .054). No correlation was found between case complexity and either accuracy or readability.

Discussion: Despite these differences, none of the models fully met all standards, indicating areas for further improvement. The findings suggest that while LLMs offer promise in clinical social work, they require refinement to better meet the field's needs.
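Since the Ateşman Readability Index may be unfamiliar outside Turkish-language research, here is a minimal sketch based on its published formula, where higher scores on a 0-100 scale indicate easier text. The vowel-counting syllable rule (syllable count equals vowel count in Turkish) and the example sentence are assumptions for illustration, not the study's procedure.

```python
# A minimal sketch of the Ateşman Readability Index for Turkish text,
# assuming its published formula.
import re

TURKISH_VOWELS = set("aeıioöuüAEIİOÖUÜ")

def atesman_index(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"\w+", text)
    # In Turkish, every syllable contains exactly one vowel.
    syllables = sum(sum(ch in TURKISH_VOWELS for ch in w) for w in words)
    avg_syll_per_word = syllables / len(words)
    avg_words_per_sentence = len(words) / sentences
    # Higher scores (0-100) indicate easier text.
    return 198.825 - 40.175 * avg_syll_per_word - 2.610 * avg_words_per_sentence

print(atesman_index("Bu metin okunabilirlik için basit bir örnektir."))
```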
Canadian Journal of Ophthalmology, Journal Year: 2025, Volume and Issue: unknown. Published: Feb. 1, 2025
To evaluate the performance of large language models (LLMs), specifically Microsoft Copilot, GPT-4 (GPT-4o and GPT-4o mini), and Google Gemini (Gemini and Gemini Advanced), in answering ophthalmological questions, and to assess the impact of prompting techniques on their accuracy. Prospective qualitative study. A total of 300 questions from StatPearls were tested, covering a range of subspecialties and image-based tasks. Each question was evaluated using 2 prompting techniques: zero-shot forced prompting (prompt 1) and combined role-based plan-and-solve+ prompting (prompt 2). With zero-shot forced prompting, GPT-4o demonstrated significantly superior overall performance, correctly answering 72.3% of questions and outperforming all other models, including Copilot (53.7%), GPT-4o mini (62.0%), Gemini (54.3%), and Gemini Advanced (62.0%) (p < 0.0001). Prompt 2 produced notable improvements over Prompt 1, elevating Copilot's accuracy from the lowest (53.7%) to the second highest (72.3%) among the LLMs. While newer iterations of LLMs, such as GPT-4o and Gemini Advanced, outperformed their less advanced counterparts (GPT-4o mini and Gemini), this study emphasizes the need for caution in clinical applications of these models. The choice of prompting technique influences accuracy, highlighting the necessity of further research to refine LLM capabilities, particularly in visual data interpretation, and to ensure safe integration into medical practice.
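To make the two techniques concrete, the sketch below shows what the two prompt styles could look like. The study's exact wording is not given in the abstract, so both templates and the placeholder question are illustrative assumptions.

```python
# A minimal sketch contrasting the two prompting styles named above.
# The exact prompts used in the study are not published in the abstract,
# so these templates are illustrative only.

QUESTION = "<multiple-choice ophthalmology question with options A-E>"

# Prompt 1: zero-shot forced prompting - demand a single letter, no reasoning.
prompt_1 = (
    "Answer the following multiple-choice question. "
    "Respond with exactly one letter (A-E) and nothing else.\n\n"
    f"{QUESTION}"
)

# Prompt 2: role-based plan-and-solve+ prompting - assign a role, have the
# model devise a plan, execute it step by step, then commit to an answer.
prompt_2 = (
    "You are a board-certified ophthalmologist.\n"
    "First, understand the question and extract the relevant clinical facts.\n"
    "Then devise a step-by-step plan to identify the correct option, "
    "carry out the plan while checking each intermediate step, "
    "and finally state your answer as a single letter (A-E).\n\n"
    f"{QUESTION}"
)
```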
European Journal of Therapeutics, Journal Year: 2025, Volume and Issue: 31(1), P. 28 - 34. Published: Feb. 28, 2025
Objectives: The aim of this study is to compare the ability of artificial intelligence-based chatbots, ChatGPT-4o and Claude 3.5, to interpret mammography images. It focuses on evaluating their accuracy and consistency in BI-RADS classification and breast parenchymal type assessment. It also aims to explore the potential of these technologies to reduce radiologists' workload and to identify their limitations in medical image analysis.

Methods: A total of 53 mammography images obtained between January and July 2024 were analyzed, with the same anonymized images provided to both chatbots under identical prompts.

Results: The results showed accuracy rates for BI-RADS classification ranging from 18.87% to 26.42% for ChatGPT-4o and 18.7% for Claude 3.5. When BI-RADS categories were grouped into a benign group (BI-RADS 1, 2) and a malignant group (BI-RADS 4, 5), the combined accuracy was 57.5% (initial evaluation) and 55% (second evaluation) for ChatGPT-4o, compared with 47.5% for Claude 3.5. Accuracy for breast parenchymal type assessment was 30.19% and 22.64% for ChatGPT-4o in the two evaluations.

Conclusions: The findings indicate that both chatbots demonstrate limited reliability in interpreting mammography images. These results highlight the need for further optimization, larger datasets, and advanced training processes to improve their performance in medical image analysis.
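A minimal sketch of the benign/malignant grouping used in the results above, assuming BI-RADS 1-2 map to benign and 4-5 to malignant as stated; the example category lists are hypothetical, not the study's data.

```python
# Grouped-accuracy sketch: BI-RADS 1-2 -> benign, 4-5 -> malignant;
# accuracy is the fraction of predictions landing in the correct group.

def to_group(bi_rads: int) -> str | None:
    if bi_rads in (1, 2):
        return "benign"
    if bi_rads in (4, 5):
        return "malignant"
    return None  # e.g. BI-RADS 0 or 3 fall outside the two groups

def grouped_accuracy(truth: list[int], predicted: list[int]) -> float:
    pairs = [(to_group(t), to_group(p)) for t, p in zip(truth, predicted)]
    scored = [(t, p) for t, p in pairs if t is not None]
    return sum(t == p for t, p in scored) / len(scored)

radiologist = [1, 2, 4, 5, 2, 4]   # reference BI-RADS categories (hypothetical)
chatbot     = [2, 1, 5, 2, 2, 4]   # chatbot outputs (hypothetical)
print(f"grouped accuracy: {grouped_accuracy(radiologist, chatbot):.1%}")
```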
Pediatric Nephrology, Journal Year: 2025, Volume and Issue: unknown. Published: March 5, 2025
Artificial intelligence (AI) has emerged as a transformative tool in healthcare, offering significant advancements in providing accurate clinical information. However, the performance and applicability of AI models in specialized fields such as pediatric nephrology remain underexplored. This study is aimed at evaluating the ability of two AI-based language models, GPT-3.5 and GPT-4, to provide reliable information in pediatric nephrology. The models were evaluated on four criteria: accuracy, scope, patient friendliness, and applicability. Forty pediatric nephrology specialists with ≥ 5 years of experience rated the GPT-3.5 and GPT-4 responses to 10 questions using a 1-5 scale via Google Forms. Ethical approval was obtained, and informed consent was secured from all participants. Both models demonstrated comparable performance across all criteria, with no statistically significant differences observed (p > 0.05). GPT-4 exhibited slightly higher mean scores across parameters, but the differences were negligible (Cohen's d < 0.1 for all criteria). Reliability analysis revealed low internal consistency for both models (Cronbach's alpha ranged between 0.019 and 0.162). Correlation analysis indicated no meaningful relationship between participants' professional experience and their evaluations (correlation coefficients ranged between -0.026 and 0.074). While both models provided a foundational level of support, neither model was superior in addressing the unique challenges of pediatric nephrology. These findings highlight the need for domain-specific training and the integration of updated guidelines to enhance AI reliability in specialized fields. The study underscores the potential of AI while emphasizing the importance of human oversight and further refinements for clinical applications.
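The sketch below computes the two statistics cited in this abstract, Cronbach's alpha and Cohen's d, from their standard definitions. The random 40 x 10 rating matrices (raters by items, on a 1-5 scale) stand in for the real questionnaire data, which are not available here.

```python
# A minimal sketch of Cronbach's alpha and Cohen's d, assuming their
# standard textbook definitions; rating matrices are simulated.
import numpy as np

def cronbach_alpha(ratings: np.ndarray) -> float:
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    k = ratings.shape[1]
    item_vars = ratings.var(axis=0, ddof=1)
    total_var = ratings.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
    # Pooled-standard-deviation form of Cohen's d.
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                     / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled

rng = np.random.default_rng(1)
gpt35_scores = rng.integers(1, 6, size=(40, 10)).astype(float)  # 40 raters x 10 items
gpt4_scores = rng.integers(1, 6, size=(40, 10)).astype(float)

print("alpha (GPT-4 items):", round(cronbach_alpha(gpt4_scores), 3))
print("Cohen's d (mean ratings):",
      round(cohens_d(gpt4_scores.mean(axis=1), gpt35_scores.mean(axis=1)), 3))
```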
PEC Innovation, Journal Year: 2025, Volume and Issue: 6, P. 100390 - 100390. Published: April 5, 2025
This study evaluated the understandability, actionability, and readability of text on anemia generated by artificial intelligence (AI) chatbots. This cross-sectional study compared texts generated by ChatGPT-3.5, Microsoft Copilot, and Google Gemini at three levels: "normal," "6th grade," and "PEMAT-P version." Additionally, texts retrieved from the top eight Google Search results for relevant keywords were included for comparison. All texts were written in Japanese. The Japanese version of PEMAT-P was used to assess understandability and actionability, while jReadability was used to assess readability. A systematic comparison was conducted to identify the strengths and weaknesses of each source. Texts generated at the 6th-grade level (n = 26, 86.7 %) and the PEMAT-P version (n = 27, 90.0 %), as well as ChatGPT-3.5 texts at the normal level (n = 21, 80.8 %), achieved significantly higher scores (≥70 %) for actionability than the other text sources (n = 17, 25.4 %, p < 0.001). For readability, Copilot demonstrated higher percentages of "very readable" to "somewhat difficult" levels than the other sources (p = 0.000-0.007). This is the first study to objectively and quantitatively evaluate AI-generated educational materials for anemia prevention. By utilizing the PEMAT-P and jReadability, it demonstrated this superiority in terms of measurable data. This innovative approach highlights the potential of AI chatbots as a novel method for providing public health information and addressing health disparities. AI-generated texts were found to be more readable and easier to understand than traditional web-based texts, with the chatbot-generated materials demonstrating the highest understandability. Moving forward, improvements to prompts will be necessary to enhance the integration of visual elements that encourage actionable responses.
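As an illustration of the threshold-based comparison reported here, the sketch below tests whether two text sources differ in how often they reach a PEMAT-P score of at least 70 %. The counts are invented stand-ins, and the chi-squared contingency test is an assumption about how such proportions might be compared, not the study's stated method.

```python
# A minimal sketch comparing how often two text sources reach the
# PEMAT-P "adequate" threshold (score >= 70 %); counts are illustrative.
from scipy.stats import chi2_contingency

#                 reached >=70 %, fell below
chatbot_counts = [26, 4]    # e.g. 6th-grade-level chatbot texts (hypothetical)
web_counts     = [17, 50]   # e.g. web search result texts (hypothetical)

chi2, p, dof, expected = chi2_contingency([chatbot_counts, web_counts])
print(f"chi2 = {chi2:.2f}, p = {p:.4g}")
```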