Journal of Nursing Scholarship,
Year: 2024, Issue: unknown
Published: Nov. 24, 2024
Abstract
Aim
The aim of this study was to evaluate and compare artificial intelligence (AI)-based large language models (LLMs) (ChatGPT-3.5, Bing, Bard) with human-based formulations in generating relevant clinical queries, using comprehensive methodological evaluations.
Methods
To interact with the major LLMs ChatGPT-3.5, Bing Chat, and Google Bard, scripts and prompts were designed to formulate PICOT (population, intervention, comparison, outcome, time) questions and search strategies.
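As an illustration only (the study's actual scripts and prompts are not reproduced in the abstract), a minimal Python sketch of such a prompt template and a derived Boolean search string might look as follows; the template wording, function names, and scenario are hypothetical:

```python
# Hypothetical sketch, not the study's actual scripts: a PICOT prompt
# template of the kind that could be sent to an LLM, plus a simple
# Boolean search string assembled from the PICOT elements.

PICOT_PROMPT = (
    "You are assisting a nurse researcher. From the clinical scenario "
    "below, formulate a PICOT question (Population, Intervention, "
    "Comparison, Outcome, Time) and propose a Boolean search strategy "
    "suitable for PubMed, Web of Science, Cochrane Library, and CINAHL.\n\n"
    "Scenario: {scenario}"
)

def build_search_string(population: str, intervention: str,
                        comparison: str, outcome: str) -> str:
    """Combine PICOT elements into a basic Boolean search string."""
    return (f'("{population}") AND ("{intervention}" OR "{comparison}") '
            f'AND ("{outcome}")')

if __name__ == "__main__":
    # Example scenario, invented for illustration.
    scenario = "Fall prevention in hospitalized older adults"
    print(PICOT_PROMPT.format(scenario=scenario))
    print(build_search_string("older adults", "hourly rounding",
                              "standard care", "falls"))
```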
The quality of the responses was assessed with a descriptive approach and independent assessment by two researchers. To determine the number of hits, results from PubMed, Web of Science, Cochrane Library, and CINAHL Ultimate were imported separately, without restrictions, using the search strings generated by the three LLMs and an additional one by a human expert. Hits from the scenarios were also exported for relevance evaluation. The use of a single scenario was chosen to provide a focused analysis. Cronbach's alpha and the intraclass correlation coefficient (ICC) were calculated.
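For reference, Cronbach's alpha has the standard definition below; for the ICC, one common choice is the two-way random-effects, single-measure form ICC(2,1) shown alongside it (the abstract does not specify which ICC form was used):

\[
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^{2}_{i}}{\sigma^{2}_{\text{total}}}\right),
\qquad
\mathrm{ICC}(2,1) = \frac{MS_R - MS_E}{MS_R + (k-1)\,MS_E + \frac{k}{n}\left(MS_C - MS_E\right)}
\]

where \(k\) is the number of items (or raters), \(\sigma^{2}_{i}\) the variance of item \(i\), \(\sigma^{2}_{\text{total}}\) the variance of the total score, \(MS_R\), \(MS_C\), and \(MS_E\) the subject, rater, and error mean squares, and \(n\) the number of subjects.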
Results
In five different scenarios, ChatGPT-3.5 generated 11,859 hits, Bing 1,376,854, Bard 16,583, and the human expert 5919 hits. We then used the first scenario to assess the relevance of the obtained results. The human expert's search strategy resulted in 65.22% (56/105) relevant articles. The most accurate AI-based LLM reached 70.79% (63/89), followed by the other two models with 21.05% (12/45) and 13.29% (42/316). Based on the evaluators' ratings, the highest score received was (M = 48.50; SD = 0.71). Cronbach's alpha and the ICC showed a high level of agreement between the evaluators. Although the human expert's percentage of relevant hits was lower than that of the most accurate LLM, this reflects nuanced evaluation criteria, where the subjective assessment prioritized contextual accuracy and quality over mere relevance.
Conclusion
This study provides valuable insights into the ability of LLMs, such as ChatGPT-3.5, Bing Chat, and Google Bard, to generate relevant clinical queries. These models demonstrate significant potential for augmenting workflows, improving query development, and supporting evidence retrieval. However, the findings highlight limitations that necessitate further refinement and continued human oversight.
Clinical Relevance
AI could assist nurses in formulating clinical queries and offer support to healthcare professionals in structuring and enhancing search strategies, thereby significantly increasing the efficiency of information retrieval.
Research Square,
Year: 2024, Issue: unknown
Published: Oct. 16, 2024
Abstract
Background
Artificial intelligence-based chatbots have gained phenomenal popularity in various areas, including the spreading of medical information. This study aimed to assess the features of two different chatbots in providing space maintainer-related information for pediatric patients and parents.
Methods
12 space maintainer-related questions were formed in accordance with current guidelines and directed to ChatGPT-3.5 and ChatGPT-4. The answers were assessed regarding the criteria of quality, reliability, readability, and similarity to previous papers by recruiting the tools EQIP, DISCERN, FRES and FKRGL calculation, GQS, and the Similarity Index.
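For context, the FRES (Flesch Reading Ease Score) and FKRGL (Flesch-Kincaid Reading Grade Level) named above are computed from average sentence length and word length; the standard formulas are:

\[
\mathrm{FRES} = 206.835 - 1.015\,\frac{\text{total words}}{\text{total sentences}} - 84.6\,\frac{\text{total syllables}}{\text{total words}}
\]
\[
\mathrm{FKRGL} = 0.39\,\frac{\text{total words}}{\text{total sentences}} + 11.8\,\frac{\text{total syllables}}{\text{total words}} - 15.59
\]

Higher FRES values indicate easier text, while FKRGL approximates the US school grade level required to understand it.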
Results
The results revealed that both chatbots had similar mean values for the evaluated parameters. ChatGPT-3.5 presented an outstanding quality and ChatGPT-4 a good one (4.58 ± 0.515 and 4.33 ± 0.492, respectively). Both also performed with high reliability (3.33 ± 0.492 and 3.58 for ChatGPT-3.5 and ChatGPT-4, respectively). The readability scores indicated that education at college-degree level was required, and the similarity to previous papers was less than 10%, consistent with originality.
Conclusions
The outcome of this study shows that AI-based chatbots such as ChatGPT can be useful for those who are seeking space maintainer-related information on the internet.
European Journal of Therapeutics,
Year: 2024, Issue: 30(6), pp. 900-909
Published: Dec. 31, 2024
Objective:
Chatbots have been frequently used in many different areas in recent years, such as diagnosis and imaging, treatment, patient follow-up support, health promotion, customer service, sales, marketing, and information and technical support. The aim of this study is to evaluate the readability, comprehensibility, and accuracy of the answers to queries on biostatistics made by researchers in the field through artificial intelligence chatbots.
Methods:
A total of 10 questions from topics frequently asked in basic biostatistics were determined by 4 experts. The questions were addressed to the chatbots by one of the experts, and the answers were recorded. In the study, the free versions of the most widely preferred chatbots, ChatGPT4, Gemini, and Copilot, were used. The recorded answers were independently evaluated as "Correct", "Partially correct", or "Wrong" by three experts who were blinded to which chatbot each answer belonged to. Then, these evaluators came together and examined the answers for a final evaluation, reaching a consensus on the levels of accuracy. The readability and understandability of the answers were assessed with the Ateşman formula, the Sönmez formula, the Çetinkaya-Uzun formula, and the Bezirci-Yılmaz formulas.
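For reference, the Ateşman formula, the Turkish adaptation of the Flesch Reading Ease score, is computed from average word and sentence length (higher scores indicate easier text):

\[
\text{Readability} = 198.825 - 40.175\,\frac{\text{total syllables}}{\text{total words}} - 2.610\,\frac{\text{total words}}{\text{total sentences}}
\]

Scores are banded into levels from "very easy" to "very difficult"; the coefficients of the Sönmez, Çetinkaya-Uzun, and Bezirci-Yılmaz formulas are not reproduced here.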
Results:
According to the answers given by the chatbots, it was found that the texts were at a "difficult" level according to the Ateşman formula, at an "insufficient reading level" according to the Çetinkaya-Uzun formula, and at an "academic" level according to the Bezirci-Yılmaz formula. On the other hand, the Sönmez formula gave the result "the text is understandable" for all chatbots. It was found that there was no statistically significant difference (p=0.819) between the chatbots in terms of the accuracy rates of the answers to the questions.
Conclusion:
Although the chatbots tended to provide accurate information, their answers were not easily readable or understandable, since the education levels they required were high.