Jurnal Penelitian Pendidikan IPA,
Journal Year:
2023,
Volume and Issue:
9(12), P. 1335 - 1341
Published: Dec. 20, 2023
Currently, many forms of sophisticated technology are used by people. This, of course, cannot be separated from the existence of artificial intelligence. One form is ChatGPT, which was developed by OpenAI. Science is a collection of knowledge resulting from human scientific and creative activity. The results of these activities produce facts, concepts, principles, laws, and theories. This activity is characterized by thought processes that take place in the mind. With developing technology, one example being ChatGPT, the current science learning process can be made easier. This research aims to examine the advantages and disadvantages of ChatGPT in learning through a systematic literature review. The review was conducted based on state-of-the-art methods using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The results explain that ChatGPT has several uses, advantages, and disadvantages in learning. For this reason, it must be used as wisely as possible, so that there are no mistakes in its application or other misuse.
International Journal of Surgery,
Journal Year:
2024,
Volume and Issue:
unknown
Published: March 19, 2024
It has been a year since the launch of Chat Generative Pre-Trained Transformer (ChatGPT), a generative artificial intelligence (AI) program. The introduction of this cross-generational product initially brought a huge shock to people with its incredible potential, and then aroused increasing concerns among them. In the field of medicine, researchers have extensively explored the possible applications of ChatGPT and achieved numerous satisfactory results. However, opportunities and issues always come together. Problems have also been exposed during the use of ChatGPT, requiring cautious handling, thorough consideration, and further guidelines for safe use. Here, we summarized the potential applications of ChatGPT in the medical field, including revolutionizing healthcare consultation, assisting patient management and treatment, transforming medical education, and facilitating clinical research. Meanwhile, we enumerated researchers' concerns arising along with its broad applications. As it is irreversible that AI will gradually permeate every aspect of modern life, we hope this review can not only promote people's understanding of its potential future, but also remind them to be more cautious about this "Pandora's Box" in the medical field. It is necessary to establish normative guidelines for its safe use as soon as possible.
Psychology and Marketing,
Journal Year:
2024,
Volume and Issue:
41(6), P. 1254 - 1270
Published: Feb. 10, 2024
Abstract
Should consumer researchers employ silicon samples and artificially generated data based on large language models, such as GPT, to mimic human respondents' behavior? In this paper, we review recent research that has compared result patterns from silicon and human samples, finding that results vary considerably across different domains. Based on these results, we present specific recommendations for silicon sample use in marketing research. We argue that silicon samples hold particular promise in upstream parts of the research process, such as qualitative pretesting and pilot studies, where researchers collect external information to safeguard follow-up design choices. We also provide a critical assessment of using silicon samples in main studies. Finally, we discuss ethical issues and future research avenues.
npj Digital Medicine,
Journal Year:
2024,
Volume and Issue:
7(1)
Published: Sept. 28, 2024
Abstract
With generative artificial intelligence (GenAI), particularly large language models (LLMs), continuing to make inroads in healthcare, assessing LLMs with human evaluations is essential to assuring safety and effectiveness. This study reviews the existing literature on human evaluation methodologies for LLMs in healthcare across various medical specialties and addresses factors such as evaluation dimensions, sample types and sizes, the selection and recruitment of evaluators, frameworks and metrics, the evaluation process, and the type of statistical analysis. Our review of 142 studies shows gaps in the reliability, generalizability, and applicability of current practices. To overcome such significant obstacles to LLM developments and deployments, we propose QUEST, a comprehensive and practical framework covering three phases of workflow: Planning, Implementation and Adjudication, and Scoring and Review. QUEST is designed with five proposed evaluation principles: Quality of Information, Understanding and Reasoning, Expression Style and Persona, Safety and Harm, and Trust and Confidence.
European Archives of Oto-Rhino-Laryngology,
Journal Year:
2024,
Volume and Issue:
281(11), P. 6099 - 6109
Published: Aug. 7, 2024
Head and neck squamous cell carcinoma (HNSCC) is a complex malignancy that requires a multidisciplinary tumor board approach for individual treatment planning. In recent years, artificial intelligence tools have emerged to assist healthcare professionals in making informed decisions. This study investigates the application of the newly published LLM Claude 3 Opus compared to the currently most advanced LLM ChatGPT 4.0 for the diagnosis and therapy planning of primary HNSCC. The results were compared with those of a conventional multidisciplinary tumor board; (2) Materials and Methods: We conducted the study in March 2024 on 50 consecutive head and neck cancer cases. The diagnostics and MDT recommendations for each patient were rated by two independent reviewers using the following parameters: clinical recommendation, explanation, and summarization, in addition to the Artificial Intelligence Performance Instrument (AIPI); (3) Results: In this study, Claude 3 Opus achieved better scores for the diagnostic workup of patients than ChatGPT 4.0 and provided treatment recommendations involving surgery, chemotherapy, and radiation therapy. In terms of clinical recommendations, explanation, and summarization, Claude 3 Opus scored similarly to ChatGPT 4.0, listing recommendations which were congruent with the MDT, but failed to cite the source of information; (4) Conclusion: This first analysis of consecutive HNSCC cases demonstrates a superior performance in diagnostic workup and treatment recommendations. It marks the advent of a newly launched AI model that may be suitable for assessment in the tumor board setting.
Advances in Medical Education and Practice,
Journal Year:
2024,
Volume and Issue:
Volume 15, P. 393 - 400
Published: May 1, 2024
Introduction: This research investigated the capabilities of ChatGPT-4 compared to medical students in answering MCQs, using the revised Bloom's Taxonomy as a benchmark. Methods: A cross-sectional study was conducted at The University of the West Indies, Barbados. ChatGPT-4 and students were assessed on MCQs from various courses using computer-based testing. Results: The study included 304 MCQs. Students demonstrated good knowledge, with 78% correctly answering at least 90% of the questions. However, ChatGPT-4 achieved a higher overall score (73.7%) than the students (66.7%). Course type significantly affected ChatGPT-4's performance, but taxonomy levels did not. A detailed association check between program courses and taxonomy levels for correct answers by ChatGPT-4 showed a highly significant correlation (p < 0.001), reflecting a concentration of "remember-level" questions in preclinical courses and "evaluate-level" questions in clinical courses. Discussion: The study highlights ChatGPT-4's proficiency on standardized tests but indicates limitations in reasoning and practical skills. The performance discrepancy suggests that the effectiveness of artificial intelligence (AI) varies based on course content. Conclusion: While ChatGPT-4 shows promise as an educational tool, its role should be supplementary, with strategic integration into medical education to leverage its strengths and address its limitations. Further research is needed to explore AI's impact on student learning across courses. Keywords: artificial intelligence, ChatGPT-4, medical students, interpretation abilities, multiple choice questions
Quantitative Biology,
Journal Year:
2024,
Volume and Issue:
12(4), P. 360 - 374
Published: June 21, 2024
Understanding complex biological pathways, including gene-gene interactions and gene regulatory networks, is critical for exploring disease mechanisms and drug development. Manual literature curation of biological pathways cannot keep up with the exponential growth of new discoveries in the literature. Large-scale language models (LLMs) trained on extensive text corpora contain rich pathway information, and they can be mined as a knowledge graph. This study assesses 21 LLMs, both application programming interface (API)-based and open-source models, in their capacities for retrieving biological knowledge. The evaluation focuses on predicting gene regulatory relations (activation, inhibition, and phosphorylation) and the recognition of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway components. Results indicated a significant disparity in model performance. The API-based models GPT-4 and Claude-Pro showed superior performance, with an F1 score of 0.4448 and 0.4386 for relation prediction, and a Jaccard similarity index of 0.2778 and 0.2657 for KEGG component recognition, respectively. Open-source models lagged behind their API-based counterparts, whereas Falcon-180b and llama2-7b had the highest F1 scores of 0.2787 and 0.1923 for relation prediction, and the highest Jaccard similarity indices for KEGG component recognition were 0.2237 for Falcon-180b and 0.2207 for llama2-7b. Our study suggests that LLMs are informative for network analysis and pathway mapping, but their effectiveness varies, necessitating careful model selection. This work also provides a case study and insight into using LLMs as knowledge graphs. Our code is publicly available at the website of GitHub (Muh-aza).
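The two metrics reported in the abstract above can be illustrated with a short, self-contained sketch: a micro-averaged F1 score over predicted regulatory relations, and the Jaccard similarity between predicted and reference pathway-component sets. The gene names, relations, and component sets below are hypothetical examples for illustration, not data from the study.

```python
# Illustrative computation of the two evaluation metrics: F1 for relation
# prediction and Jaccard similarity for pathway-component recognition.
# All gene names and labels below are hypothetical, not study data.

def f1_score(gold: dict, pred: dict) -> float:
    """Micro-averaged F1 over (gene pair -> relation) predictions."""
    tp = sum(1 for pair, rel in pred.items() if gold.get(pair) == rel)
    fp = len(pred) - tp
    fn = sum(1 for pair in gold if pred.get(pair) != gold[pair])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def jaccard(gold: set, pred: set) -> float:
    """Jaccard similarity between reference and predicted component sets."""
    return len(gold & pred) / len(gold | pred) if gold | pred else 0.0

# Hypothetical gold-standard and model-predicted regulatory relations.
gold_relations = {("TP53", "MDM2"): "activation",
                  ("MDM2", "TP53"): "inhibition",
                  ("AKT1", "GSK3B"): "phosphorylation"}
pred_relations = {("TP53", "MDM2"): "activation",
                  ("MDM2", "TP53"): "activation",       # wrong relation type
                  ("AKT1", "GSK3B"): "phosphorylation"}

print(round(f1_score(gold_relations, pred_relations), 4))   # 0.6667
print(round(jaccard({"TP53", "MDM2", "AKT1"},
                    {"TP53", "AKT1", "EGFR"}), 4))          # 0.5
```

In this toy setting the model gets two of three relations right (precision = recall = 2/3), and two of four distinct components overlap between the predicted and reference sets.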
bioRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2024,
Volume and Issue:
unknown
Published: Jan. 24, 2024
Abstract
Background
Understanding complex biological pathways, including gene-gene interactions and gene regulatory networks, is critical for exploring disease mechanisms and drug development. Manual literature curation of biological pathways is useful but cannot keep up with the exponential growth of the literature. Large-scale language models (LLMs), notable for their vast parameter sizes and comprehensive training on extensive text corpora, have great potential in the automated mining of biological pathways.
Method
This study assesses the effectiveness of 21 LLMs, both API-based and open-source models. The evaluation focused on two key aspects: gene regulatory relation prediction (specifically, 'activation', 'inhibition', and 'phosphorylation') and KEGG pathway component recognition. The performance of these models was analyzed using statistical metrics such as precision, recall, F1 scores, and the Jaccard similarity index.
Results
Our results indicated a significant disparity in model performance. Among the API-based models, ChatGPT-4 and Claude-Pro showed superior performance, with an F1 score of 0.4448 and 0.4386 for relation prediction, and a Jaccard similarity index of 0.2778 and 0.2657 for KEGG component recognition, respectively. Open-source models lagged behind their API-based counterparts, where Falcon-180b-chat and llama1-7b led with the highest F1 scores (0.2787 and 0.1923, respectively) for relation prediction and the highest Jaccard similarity indices (0.2237 and 0.2207, respectively) for KEGG component recognition.
Conclusion
LLMs are valuable in biomedical research, especially for network analysis and pathway mapping. However, their effectiveness varies, necessitating careful model selection. This work also provided a case study and insight into using LLMs as knowledge graphs.
Open Journal of Obstetrics and Gynecology,
Journal Year:
2025,
Volume and Issue:
15(01), P. 1 - 9
Published: Jan. 1, 2025
Objective: This study assesses the quality of artificial intelligence chatbots in responding to standardized obstetrics and gynecology questions. Methods: ChatGPT-3.5, ChatGPT-4.0, Bard, and Claude were used to respond to 20 multiple choice questions on October 7, 2023, and the responses and their correctness were recorded. A logistic regression model assessed the relationship between question character count and accuracy. For each incorrectly answered question, an independent error analysis was undertaken. Results: ChatGPT-4.0 scored 100% across both obstetrics and gynecology. ChatGPT-3.5 scored 95% overall, earning 85.7% in gynecology. Claude scored 90% overall and 84.6% in gynecology, while Bard scored 77.8% overall, 83.3% in obstetrics, and 75% in gynecology, and would not answer two questions. There was no statistically significant relationship between question character count and accuracy. Conclusions: ChatGPT-4.0 excelled, while ChatGPT-3.5 and Claude performed well but possessed minor weaknesses; Bard performed comparatively worst and had the most limitations, leading to our support of the other chatbots as preferred tools. Our findings support the use of chatbots as a supplement, not a substitute, for clinician-based learning or historically successful educational resources.
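The logistic regression described in the Methods above can be sketched in a minimal, self-contained form: answer correctness (0/1) is regressed on question character count, and the sign of the fitted slope indicates whether longer questions are associated with lower accuracy. The character counts and outcomes below are made up for illustration; the study's own data and fitting procedure are not reproduced here.

```python
# Minimal sketch of a logistic regression of answer correctness (0/1)
# on question character count, fit by gradient ascent on the
# log-likelihood in pure Python. Data points are hypothetical.
import math

def fit_logistic(xs, ys, lr=1e-5, steps=50000):
    """Fit P(correct) = sigmoid(b0 + b1 * (x - mean)) and return (b0, b1)."""
    mean_x = sum(xs) / len(xs)
    xs = [x - mean_x for x in xs]   # center for numerical stability
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p             # log-likelihood gradient w.r.t. intercept
            g1 += (y - p) * x       # log-likelihood gradient w.r.t. slope
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Hypothetical question lengths (characters) and correctness outcomes.
char_counts = [120, 180, 240, 310, 350, 410, 460, 520]
correct =     [1,   1,   1,   0,   1,   0,   0,   0]

b0, b1 = fit_logistic(char_counts, correct)
print(f"slope on character count: {b1:.4f}")
```

Centering the predictor does not change the slope, only the intercept; with these made-up data the slope comes out negative, i.e. longer questions are fitted as less likely to be answered correctly.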