Scientific Reports, Journal Year: 2025, Volume and Issue: 15(1), Published: Jan. 29, 2025
Visual diagnosis is one of the key features of squamous cell carcinoma of the oral cavity (OSCC) and the oropharynx (OPSCC), both subsets of head and neck squamous cell carcinoma (HNSCC) with a heterogeneous clinical appearance. Advancements in artificial intelligence have recently led to image recognition being introduced into large language models (LLMs) such as ChatGPT 4.0. This exploratory study, for the first time, evaluated the application of image recognition by ChatGPT 4.0 to diagnose SCC and leukoplakia based on clinical images, with images without any lesion serving as a control group. A total of 45 images were analyzed, comprising 15 cases each of SCC, leukoplakia, and non-lesion images. ChatGPT 4.0 was tasked with providing the most likely diagnosis for these images; in one scenario the patient history was provided, whereas in the other only the images were given. The accuracy of the LLM's results was rated by independent reviewers, and overall performance was assessed using a modified Artificial Intelligence Performance Index (AIPI). ChatGPT 4.0 demonstrated the ability to correctly identify leukoplakia from the images alone, while its recognition of SCC was insufficient but improved when the patient history was included in the prompt. Providing the history, however, also resulted in misclassification in some cases. Oral cavity lesions were more likely to be diagnosed correctly than oropharyngeal lesions. In this study, ChatGPT 4.0 was convincing in detecting SCC lesions only when the patient history was added, whereas leukoplakia was detected solely through image recognition. Image recognition in ChatGPT 4.0 is therefore currently insufficient for reliable OPSCC and OSCC diagnosis, although further technological advancements may pave the way for its use in the clinical setting.
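
As an illustration of the kind of workflow this study describes, the sketch below shows how a clinical image, with or without an accompanying patient history, might be submitted to a vision-capable LLM through the OpenAI Python SDK. This is a minimal sketch under stated assumptions, not the authors' protocol: the model name, prompt wording, and image handling are illustrative only.

    # Minimal sketch: querying a vision-capable LLM with a clinical image,
    # optionally adding a patient history to the prompt. Illustrative only;
    # model name and prompt wording are assumptions, not the study protocol.
    import base64
    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    def diagnose(image_path: str, history: str | None = None) -> str:
        with open(image_path, "rb") as f:
            image_b64 = base64.b64encode(f.read()).decode()
        prompt = "What is the most likely diagnosis for the lesion in this image?"
        if history:
            prompt += f" Patient history: {history}"
        response = client.chat.completions.create(
            model="gpt-4o",  # assumed vision-capable model
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
                ],
            }],
        )
        return response.choices[0].message.content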
Clinical Medicine Insights Oncology, Journal Year: 2025, Volume and Issue: 19, Published: Jan. 1, 2025
Despite the expanding therapeutic options available to cancer patients, treatment resistance, disease recurrence, and metastasis persist as hallmark challenges in the treatment of cancer. The rise to prominence of generative artificial intelligence (GenAI) in many realms of human activities is compelling consideration of its capabilities as a potential lever to advance the development of effective cancer treatments. This article presents a hypothetical case study on the application of generative pre-trained transformers (GPTs) to metastatic prostate cancer (mPC). It explores the design of GPT-supported adaptive intermittent therapy for mPC. Testosterone and prostate-specific antigen (PSA) are assumed to be repeatedly monitored, while therapy may involve a combination of androgen deprivation therapy (ADT), androgen receptor-signalling inhibitors (ARSI), chemotherapy, and radiotherapy. The analysis covers various questions relevant to the configuration, training, and inferencing of GPTs for mPC, with particular attention to risk mitigation regarding the hallucination problem and the implications of integrating GenAI technologies into clinical practice. The article provides the elements of an actionable pathway toward the realization of GenAI-assisted adaptive intermittent therapy. As such, it is expected to help facilitate trials of GenAI-supported therapy for mPC.
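
To make the notion of adaptive intermittent therapy concrete, the sketch below implements the kind of simple threshold rule (pause ADT when PSA falls below a floor, resume when it rises above a ceiling) that a GPT-supported system would be expected to refine with richer patient context. The thresholds and rule are illustrative assumptions, not clinical guidance and not the article's proposal.

    # Minimal sketch of threshold-based intermittent therapy logic.
    # Thresholds are illustrative assumptions, not clinical guidance.
    from dataclasses import dataclass

    @dataclass
    class TherapyState:
        on_adt: bool = True  # whether androgen deprivation therapy is active

    def update_therapy(state: TherapyState, psa_ng_ml: float,
                       lower: float = 4.0, upper: float = 10.0) -> TherapyState:
        """Pause ADT once PSA drops below `lower`; resume once it exceeds `upper`.

        A GPT-supported system, as discussed in the article, would instead
        condition such decisions on the full monitored history (PSA,
        testosterone, prior treatments) rather than on two fixed thresholds.
        """
        if state.on_adt and psa_ng_ml < lower:
            return TherapyState(on_adt=False)   # treatment holiday
        if not state.on_adt and psa_ng_ml > upper:
            return TherapyState(on_adt=True)    # restart therapy
        return state                            # no change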
Journal of Medical Internet Research, Journal Year: 2024, Volume and Issue: 26, P. e54985 - e54985, Published: Sept. 10, 2024
ChatGPT (OpenAI) has shown great potential in clinical diagnosis and could become an excellent auxiliary tool in clinical practice. This study investigates and evaluates its diagnostic capabilities by comparing the performance of GPT-3.5 and GPT-4.0 across model iterations.
BACKGROUND
Polycystic ovary syndrome (PCOS) is a prevalent condition requiring effective patient education, particularly in China. Large language models (LLMs) present a promising avenue for this. This two-phase study evaluates six LLMs for educating Chinese patients about PCOS. It assesses their capabilities in answering questions, interpreting ultrasound images, and providing instructions within a real-world clinical setting.
OBJECTIVE
We systematically evaluated six large language models—Gemini 2.0 Pro, OpenAI o1, ChatGPT-4o, ChatGPT-4, ERNIE 4.0, and GLM-4—for use in gynecological medicine. We assessed their performance in several areas: answering questions from the Gynecology Qualification Examination, understanding and coping with polycystic ovary syndrome cases, writing patient instructions, and helping to solve clinical problems.
METHODS
A two-step evaluation method was used. First, we tested the models on 136 exam questions and 36 ultrasound images and compared the results with those of medical students and residents. Six gynecologists rated the models' responses to 23 PCOS-related questions using a Likert scale, and a readability tool was used to review the content objectively. In the following phase, 40 patients with PCOS consulted the two central systems, Gemini 2.0 Pro and OpenAI o1, and we compared them in terms of patient satisfaction, text readability, and professional evaluation.
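
As a rough illustration of the scoring step, the sketch below averages Likert ratings from several raters and computes a readability grade with the textstat package. The field names and the choice of the Flesch-Kincaid grade are assumptions, since the abstract does not name the readability tool used.

    # Minimal sketch: aggregating Likert ratings and scoring readability.
    # Field names and the Flesch-Kincaid choice are assumptions.
    from statistics import mean
    import textstat

    def summarize_response(response_text: str, ratings: list[int]) -> dict:
        return {
            "mean_likert": mean(ratings),  # e.g., ratings from six gynecologists
            "readability_grade": textstat.flesch_kincaid_grade(response_text),
        }

    # Example: one model response rated by six reviewers on a 1-5 scale.
    print(summarize_response(
        "PCOS is commonly managed with lifestyle changes and medication.",
        [4, 5, 4, 3, 4, 5]))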
RESULTS
During the initial phase of testing, OpenAI o1 demonstrated impressive accuracy on the examination and specialist questions, achieving rates of 93.63% and 92.40%, respectively. Additionally, performance on the image-based diagnostic tasks was noteworthy, with one model achieving an accuracy of 69.44% and another reaching 53.70%. Regarding the responses to the PCOS-related questions, OpenAI o1 significantly outperformed the other models in accuracy, completeness, practicality, and safety. However, its responses were notably more complex (average readability score 13.98, p = 0.003). The second-phase comparison revealed that one of the two systems excelled in satisfaction ratings (patient rating 3.45, p < 0.01; physician rating 3.35, p = 0.03), surpassing the other (2.65 and 2.90, respectively), while slightly lagging behind in completeness (3.05 vs. 3.50, p = 0.04).
CONCLUSIONS
This study reveals that large language models have considerable potential to address the issues faced by patients with PCOS, as they are capable of providing accurate and comprehensive responses. Nevertheless, their output still needs to be strengthened so that it can balance clarity with comprehensiveness. In addition, the broader capabilities of large models beyond analysis, especially the ability to handle regulation-related categories, should be improved to meet the demands of clinical practice.
CLINICALTRIAL
None
The American Surgeon, Journal Year: 2025, Volume and Issue: unknown, Published: March 12, 2025
Background
Large language models (LLMs) are advanced tools capable of understanding and generating human-like text. This study evaluated the accuracy of several commercial LLMs in addressing clinical questions related to the diagnosis and management of acute cholecystitis, as outlined in the Tokyo Guidelines 2018 (TG18). We assessed their congruence with the expert panel discussions presented in the guidelines.
Methods
ChatGPT4.0, Gemini Advanced, and GPTo1-preview were tested on ten clinical questions. Eight were derived from TG18, and two were formulated by the authors. Two authors independently rated each LLM's responses on a four-point scale: (1) accurate and comprehensive, (2) accurate but not comprehensive, (3) partially accurate, partially inaccurate, and (4) entirely inaccurate. A third author resolved any scoring discrepancies. Then, we comparatively analyzed the performance of ChatGPT4.0 against the newer large language models (LLMs), specifically Gemini Advanced and GPTo1-preview, on the same set of questions to delineate their respective strengths and limitations.
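
The two-rater scoring with third-author adjudication can be expressed as a small helper; the sketch below is one possible encoding, with the scale labels taken from the abstract and everything else assumed.

    # Minimal sketch of the two-rater, four-point scoring with adjudication.
    # Scale labels follow the abstract; the tie-breaking helper is assumed.
    SCALE = {
        1: "accurate and comprehensive",
        2: "accurate but not comprehensive",
        3: "partially accurate, partially inaccurate",
        4: "entirely inaccurate",
    }

    def final_score(rater1: int, rater2: int,
                    adjudicator: int | None = None) -> int:
        """Return the agreed score; a third author resolves disagreements."""
        if rater1 == rater2:
            return rater1
        if adjudicator is None:
            raise ValueError("raters disagree; adjudication required")
        return adjudicator

    # Example: raters disagree (2 vs. 3); the third author settles on 2.
    print(SCALE[final_score(2, 3, adjudicator=2)])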
Results
ChatGPT4.0 provided consistent responses for 90% of the questions. It delivered “accurate and comprehensive” answers for 4/10 (40%) of the questions and “accurate but not comprehensive” answers for 5/10 (50%). One response (10%) was “partially accurate, partially inaccurate.” The newer LLMs demonstrated higher accuracy on some questions but yielded a similar percentage of “partially accurate, partially inaccurate” responses. Notably, neither model produced “entirely inaccurate” answers.
Discussion
LLMs, such as ChatGPT, demonstrate potential in accurately addressing clinical questions regarding acute cholecystitis. With awareness of their limitations, careful implementation, and ongoing refinement, LLMs could serve as valuable resources for physician education and patient information, potentially improving clinical decision-making in the future.
Digital Health, Journal Year: 2025, Volume and Issue: 11, Published: April 1, 2025
Background
With the development of the information age, an increasing number of patients are seeking information about their diseases on the Internet. In the medical field, several studies have confirmed that ChatGPT has great potential for use in patient education, generating imaging reports, and even providing clinical diagnosis and treatment decisions, but its ability to answer questions about gallstones has not yet been reported in the literature.
Objective
The aim of this study was to evaluate the consistency and accuracy of ChatGPT-generated answers to questions about cholelithiasis, compared with those provided by experts.
Methods
This study designed a question-answering task based on clinical practice guidelines for cholelithiasis. Questions were presented in the form of keywords and categorized into general and professional questions. To compare ChatGPT's performance with the expert answers, the study employed a modified matching scoring system, keyword proportion evaluation, and the DISCERN tool.
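
As an illustration of keyword-proportion evaluation, the sketch below computes the share of guideline keywords covered by an answer. The keyword list and matching rule (case-insensitive substring) are assumptions, since the abstract does not specify the modified scoring system in detail.

    # Minimal sketch of keyword-proportion scoring for an answer.
    # Keyword list and matching rule are assumptions, not the study's system.
    def keyword_proportion(answer: str, keywords: list[str]) -> float:
        """Fraction of expected keywords mentioned in the answer."""
        text = answer.lower()
        hits = sum(1 for kw in keywords if kw.lower() in text)
        return hits / len(keywords) if keywords else 0.0

    # Hypothetical example for a general question about gallstone management.
    expected = ["cholecystectomy", "ultrasound", "biliary colic", "ERCP"]
    answer = ("Symptomatic gallstones with biliary colic are usually treated "
              "by laparoscopic cholecystectomy after ultrasound confirmation.")
    print(f"{keyword_proportion(answer, expected):.2f}")  # prints 0.75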
Results
ChatGPT often provides more keywords in its responses, but its keyword proportion is significantly lower than that of the doctors (P < .001). For the 33 general questions, ChatGPT demonstrated performance similar to that of the experts in both the matching score system and keyword proportion (P = .856 and P = .829, respectively). However, for the 32 professional questions, the experts consistently outperformed ChatGPT (P = .004 and P = .016). Additionally, while the DISCERN tool showed differences between the ChatGPT and expert answers (P < .001), both types of answers were evaluated at a high level overall.
Conclusions
Currently, ChatGPT performs similarly to experts in answering general questions about cholelithiasis, but it cannot replace experts in clinical decision-making. As ChatGPT's performance improves through deep learning, it is expected to become a useful and effective tool in this field. Nevertheless, in specialized areas, careful attention and continuous evaluation will be necessary to ensure the accuracy, reliability, and safety of its use in the medical field.