AI versus human-generated multiple-choice questions for medical education: a cohort study in a high-stakes examination
BMC Medical Education,
Journal year: 2025
Issue: 25(1)
Published: Feb. 8, 2025
Language: English
Large language models for diabetes training: a prospective study
Science Bulletin,
Journal year: 2025
Issue: unknown
Published: Jan. 1, 2025
Language: English
Evaluating the Accuracy, Reliability, Consistency, and Readability of Different Large Language Models in Restorative Dentistry
Journal of Esthetic and Restorative Dentistry,
Journal year: 2025
Issue: unknown
Published: March 2, 2025
This study aimed to evaluate the reliability, consistency, and readability of responses provided by various artificial intelligence (AI) programs to questions related to Restorative Dentistry. Forty-five knowledge-based information [...] and 20 questions (10 patient-related and 10 dentistry-specific) were posed to the ChatGPT-3.5, ChatGPT-4, ChatGPT-4o, Chatsonic, Copilot, and Gemini Advanced chatbots. The DISCERN questionnaire was used to assess reliability; the Flesch Reading Ease and Flesch-Kincaid Grade Level scores were utilized to assess readability. Accuracy and consistency were determined based on the chatbots' responses to the [...] questions. Copilot demonstrated "good" reliability, while ChatGPT-3.5 showed "fair" reliability. Chatsonic exhibited the highest "DISCERN total score" for [...] questions, while ChatGPT-4o performed best for dentistry-specific questions. No significant differences were found in [...] among chatbots (p > 0.05). [...] accuracy (93.3%), while [...] had the lowest (68.9%). ChatGPT-4 [...] between repetitions. Performance of the AIs varied in terms of accuracy, [...] when responding to Restorative Dentistry questions. [...] promising results for academic and patient education applications. However, [...] generally above recommended levels for [...] materials. The utilization of AI has an increasing impact on [...] aspects of dentistry. Moreover, if [...] restorative dentistry prove to be reliable and comprehensible, this may yield promising outcomes in the future.
Language: English
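The abstract above names the Flesch Reading Ease and Flesch-Kincaid Grade Level scores as its readability measures. As a rough illustration of what those scores compute, the sketch below applies the standard published formulas to a piece of text. The function names and the vowel-group syllable counter are assumptions made for this example; the study's actual tooling is not described in the abstract, and a dictionary-based syllable counter would give somewhat different values.

```python
import re


def count_syllables(word: str) -> int:
    """Crude vowel-group heuristic (an assumption, not the study's method)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent unless the word ends in "le".
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    # Standard Flesch Reading Ease formula: higher means easier to read.
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Standard Flesch-Kincaid Grade Level formula: approximate US school grade.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fre, fkgl


if __name__ == "__main__":
    fre, fkgl = readability("The cat sat on the mat.")
    print(f"Reading Ease: {fre:.1f}, Grade Level: {fkgl:.1f}")
```

Longer sentences and longer words lower the Reading Ease score and raise the Grade Level, which is the sense in which chatbot responses can sit "above recommended levels" for patient materials.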
Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology
Applied Sciences,
Journal year: 2024
Issue: 15(1), p. 61
Published: Dec. 25, 2024
This study explores the potential of large language models (LLMs) in predicting medical diagnoses from Spanish-language clinical case descriptions, offering an alternative to traditional machine learning (ML) and deep learning (DL) techniques. Unlike ML and DL models, which typically rely on extensive domain-specific training and complex data preprocessing, LLMs can process unstructured text data directly without the need for specialized datasets. This unique characteristic of LLMs allows for faster implementation and eliminates the risks associated with overfitting, which are common in ML and DL models that require tailored training for each new dataset. In this research, we investigate the capacities of several state-of-the-art LLMs in predicting medical diagnoses based on Spanish textual descriptions of clinical cases. We measured the impact of prompt techniques and temperatures on the quality of the diagnosis. Our results indicate that Gemini Pro and Mixtral 8x22b generally performed well across different techniques, while Medichat Llama3 showed more variability, particularly with the few-shot prompting technique. Low temperatures and specific prompt techniques, such as zero-shot and Retrieval-Augmented Generation (RAG), tended to yield clearer and more accurate diagnoses. This study highlights the potential of LLMs as a disruptive alternative to traditional ML and DL approaches, offering a more efficient, scalable, and flexible solution for medical diagnostics, particularly in the non-English-speaking population.
Language: English
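The abstract above compares prompt techniques (zero-shot, few-shot) and sampling temperatures. The sketch below illustrates, under stated assumptions, how a zero-shot and a few-shot prompt differ for this kind of task. Every name here (`Example`, `zero_shot_prompt`, `few_shot_prompt`, `build_request`), the instruction wording, the payload shape, and the sample case text are hypothetical and are not taken from the paper; no model or vendor API is called.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One worked example for few-shot prompting (hypothetical structure)."""
    case_text: str
    diagnosis: str


# Hypothetical instruction wording; the paper's actual prompts are not shown.
INSTRUCTION = "Read the clinical case and state the single most likely condition."


def zero_shot_prompt(case_text: str) -> str:
    """Zero-shot: the instruction and the new case only, with no examples."""
    return f"{INSTRUCTION}\n\nCase: {case_text}\nDiagnosis:"


def few_shot_prompt(case_text: str, examples: list[Example]) -> str:
    """Few-shot: solved examples are placed before the new case."""
    shots = "\n\n".join(
        f"Case: {e.case_text}\nDiagnosis: {e.diagnosis}" for e in examples
    )
    return f"{INSTRUCTION}\n\n{shots}\n\nCase: {case_text}\nDiagnosis:"


def build_request(prompt: str, temperature: float) -> dict:
    """Bundle a prompt with a temperature (generic payload, not a real API)."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    return {"prompt": prompt, "temperature": temperature}


if __name__ == "__main__":
    # Invented placeholder texts, for illustration only.
    examples = [Example("Caso de ejemplo 1 ...", "Diagnóstico de ejemplo 1")]
    new_case = "Descripción del caso clínico ..."
    print(build_request(zero_shot_prompt(new_case), temperature=0.0))
    print(build_request(few_shot_prompt(new_case, examples), temperature=0.0))
```

A lower temperature makes sampling more deterministic, which is consistent with the abstract's observation that low temperatures tended to yield clearer diagnoses.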