Conspiracy theories are a paradigmatic example of beliefs that, once adopted, are extremely difficult to dispel. Influential psychological theories propose that conspiracy beliefs are uniquely resistant to counterevidence because they satisfy important needs and motivations. Here, we raise the possibility that previous attempts to correct conspiracy beliefs have been unsuccessful merely because they failed to deliver counterevidence that was sufficiently compelling and tailored to each believer's specific conspiracy theory (which varies dramatically from believer to believer). To evaluate this possibility, we leverage recent developments in generative artificial intelligence (AI) to deliver well-argued, person-specific debunks to a total of N = 2,190 conspiracy believers. Participants in our experiments provided detailed, open-ended explanations of a conspiracy theory they believed, and then engaged in a three-round dialogue with a frontier AI model (GPT-4 Turbo), which was instructed to reduce each participant's belief in their conspiracy theory (or to discuss a banal topic in a control condition). Across two experiments, we find robust evidence that the debunking conversation reduced belief in the conspiracy theory by roughly 20%. This effect did not decay over 2 months' time, was consistently observed across a wide range of different conspiracy theories, and occurred even for participants whose beliefs were deeply entrenched and of great importance to their identities. Furthermore, although the dialogues focused on a single conspiracy theory, the intervention spilled over to reduce belief in unrelated conspiracies, indicating a general decrease in conspiratorial worldview, as well as increasing intentions to challenge others who espouse their chosen conspiracy. These findings highlight that many people who strongly believe in seemingly fact-resistant conspiratorial beliefs can change their minds in the face of sufficient evidence.
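As a rough illustration of the kind of dialogue procedure described above, the following Python sketch runs a short multi-round exchange with GPT-4 Turbo through the OpenAI API. It is not the authors' experimental code: the system prompt wording, the helper name run_debunking_dialogue, and the fixed three-round limit are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical instruction; the study's actual prompt materials are not reproduced here.
SYSTEM_PROMPT = (
    "The participant believes the conspiracy theory summarized below. Respond with "
    "accurate, well-sourced counterevidence tailored to their stated reasons, and try "
    "to reduce their confidence in the theory.\n\nParticipant's statement: {statement}"
)

def run_debunking_dialogue(statement: str, participant_turns: list[str]) -> list[str]:
    """Run up to three dialogue rounds and return the model's replies."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT.format(statement=statement)}]
    replies = []
    for turn in participant_turns[:3]:  # three rounds, mirroring the study design
        messages.append({"role": "user", "content": turn})
        response = client.chat.completions.create(model="gpt-4-turbo", messages=messages)
        reply = response.choices[0].message.content
        messages.append({"role": "assistant", "content": reply})
        replies.append(reply)
    return replies
```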
Frontiers of Computer Science, 2024, 18(6). Published: March 22, 2024.
Autonomous agents have long been a research focus in academic and industry communities. Previous research often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and makes it hard for the agents to achieve human-like decisions. Recently, through the acquisition of vast amounts of Web knowledge, large language models (LLMs) have shown the potential to reach human-level intelligence, leading to a surge in research on LLM-based autonomous agents. In this paper, we present a comprehensive survey of these studies, delivering a systematic review from a holistic perspective. We first discuss the construction of LLM-based autonomous agents, proposing a unified framework that encompasses much of the previous work. Then, we present an overview of the diverse applications of LLM-based autonomous agents in social science, natural science, and engineering. Finally, we delve into the evaluation strategies commonly used for LLM-based autonomous agents. Based on the previous studies, we also present several challenges and future directions in this field.
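The survey's unified agent framework is not reproduced here, but the following minimal Python sketch illustrates the general pattern such frameworks describe: an LLM conditioned on a goal and an accumulating memory proposes the next action each cycle. The function name agent_step, the memory handling, and the prompt wording are all illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def agent_step(goal: str, memory: list[str]) -> str:
    """One observe-plan-act cycle: condition the model on the goal and recent memory."""
    context = "\n".join(memory[-5:]) or "(no observations yet)"
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": "You are an autonomous agent. Propose a single next action."},
            {"role": "user", "content": f"Goal: {goal}\nRecent observations:\n{context}"},
        ],
    )
    action = response.choices[0].message.content
    memory.append(f"Proposed action: {action}")  # naive memory: append every step
    return action
```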
Natural Language Processing Journal, 2023, 6, p. 100048. Published: Dec. 19, 2023.
Large language models (LLMs) are a special class of pretrained language models (PLMs) obtained by scaling model size, pretraining corpus, and computation. LLMs, because of their large size and pretraining on large volumes of text data, exhibit special abilities which allow them to achieve remarkable performances without any task-specific training in many of the natural language processing tasks. The era of LLMs started with OpenAI's GPT-3 model, and their popularity has increased exponentially after the introduction of models like ChatGPT and GPT4. We refer to GPT-3 and its successor OpenAI models, including GPT4, as the GPT-3 family of large language models (GLLMs). With the ever-rising popularity of GLLMs, especially in the research community, there is a strong need for a comprehensive survey which summarizes the recent research progress in multiple dimensions and can guide the research community with insightful future directions. We start the survey paper with foundation concepts like transformers, transfer learning, self-supervised learning, and pretrained language models. We then present a brief overview of GLLMs and discuss their performances in various downstream tasks, specific domains, and languages. We also discuss their data labelling and data augmentation abilities, their robustness, and their effectiveness as evaluators, and finally, conclude with insightful future research directions. To summarize, this survey paper will serve as a good resource for both academic and industry people to stay updated with the latest research related to GLLMs.
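As a small, hedged example of one GLLM ability the survey reviews (data augmentation by paraphrasing), the sketch below asks a GPT-family model to rewrite a training example several ways. The prompt wording and the helper name augment are assumptions, not taken from the survey.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def augment(text: str, n: int = 3) -> list[str]:
    """Ask the model for n paraphrases of a labelled training example."""
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{
            "role": "user",
            "content": f"Rewrite the following sentence in {n} different ways, one per line:\n{text}",
        }],
    )
    lines = response.choices[0].message.content.splitlines()
    return [line.strip() for line in lines if line.strip()][:n]

# e.g., augment("The battery drains far too quickly.") returns up to three paraphrased variants
```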
Proceedings of the National Academy of Sciences, 2024, 121(34). Published: Aug. 12, 2024.
The social and behavioral sciences have been increasingly using automated text analysis to measure psychological constructs in text. We explore whether GPT, the large-language model (LLM) underlying the AI chatbot ChatGPT, can be used as a tool for automated psychological text analysis in several languages. Across 15 datasets (n = 47,925 manually annotated tweets and news headlines), we tested whether different versions of GPT (3.5 Turbo, 4, and 4 Turbo) can accurately detect psychological constructs (sentiment, discrete emotions, offensiveness, and moral foundations) across 12 languages. We found that GPT (r = 0.59 to 0.77) performed much better than English-language dictionary analysis (r = 0.20 to 0.30) at detecting these constructs as judged by manual annotators. GPT performed nearly as well as, and sometimes better than, top-performing fine-tuned machine learning models. Moreover, GPT's performance improved across successive model versions, particularly for lesser-spoken languages, and became less expensive. Overall, GPT may be superior to many existing methods of automated text analysis, since it achieves relatively high accuracy across many languages, requires no training data, and is easy to use with simple prompts (e.g., “is this text negative?”) and little coding experience. We provide sample code and a video tutorial for analyzing texts with the GPT application programming interface. We argue that GPT and other LLMs help democratize automated text analysis by making advanced natural language processing capabilities more accessible, and may help facilitate cross-linguistic research with understudied languages.
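The paper ships its own sample code and video tutorial; the snippet below is an independent, minimal sketch of the same idea, posing a simple yes/no prompt (e.g., “is this text negative?”) to a GPT model through the API. The helper name is_negative and the exact prompt wording are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def is_negative(text: str, model: str = "gpt-4-turbo") -> bool:
    """Pose a simple yes/no classification prompt and parse the one-word answer."""
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": f'Is this text negative? Answer "yes" or "no".\nText: {text}',
        }],
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")

# e.g., is_negative("I waited two hours and nobody helped me.") would typically return True
```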
Science, 2024, 385(6714). Published: Sept. 12, 2024.
Conspiracy theory beliefs are notoriously persistent. Influential hypotheses propose that they fulfill important psychological needs, thus resisting counterevidence. Yet previous failures in correcting conspiracy beliefs may be due to counterevidence being insufficiently compelling and tailored. To evaluate this possibility, we leveraged developments in generative artificial intelligence and engaged 2,190 conspiracy believers in personalized, evidence-based dialogues with GPT-4 Turbo. The intervention reduced conspiracy belief by ~20%. The effect remained 2 months later, generalized across a wide range of conspiracy theories, and occurred even among participants with deeply entrenched beliefs. Although the dialogues focused on a single conspiracy, they nonetheless diminished belief in unrelated conspiracies and shifted conspiracy-related behavioral intentions. These findings suggest that many conspiracy believers can revise their views if presented with sufficiently compelling evidence.
In an era where artificial intelligence is increasingly interfacing with diverse cultural contexts, the ability of language models to accurately represent and adapt to these contexts is of paramount importance. The present research undertakes a meticulous evaluation of three prominent commercial models (Google Gemini 1.5, ChatGPT-4, and Anthropic's Claude 3 Sonnet), with a focus on their handling of the Turkish language. Through a dual approach of quantitative metrics, the Cultural Inaccuracy Score (CIS) and the Cultural Sensitivity Index (CSI), alongside qualitative analyses via detailed case studies, disparities in model performances were highlighted. Notably, Claude 3 Sonnet exhibited superior cultural sensitivity, underscoring the effectiveness of its advanced training methodologies. Further analysis revealed that all models demonstrated varying degrees of cultural competence, suggesting significant room for improvement. The findings emphasize the necessity of enriched and diversified datasets, as well as innovative algorithmic enhancements, to reduce cultural inaccuracies and enhance the models' global applicability. Strategies for mitigating hallucinations are discussed, focusing on the refinement of training processes and continuous evaluation to foster improvements in AI cultural adaptiveness. The study aims to contribute to the ongoing development of these technologies, ensuring they respect and reflect the rich tapestry of human cultures.
Large language models (LLMs) have shown remarkable performance across various natural language processing (NLP) tasks, indicating their significant potential as data annotators. Although LLM-generated annotations are more cost-effective and efficient to obtain, they are often erroneous for complex or domain-specific tasks and may introduce bias when compared with human annotations. Therefore, instead of completely replacing human annotators with LLMs, we need to leverage the strengths of both LLMs and humans to ensure the accuracy and reliability of annotations. This paper presents a multi-step human-LLM collaborative approach where (1) LLMs generate labels and provide explanations, (2) a verifier assesses the quality of the LLM-generated labels, and (3) humans re-annotate the subset of labels with lower verification scores. To facilitate this collaboration, we make use of the LLM's ability to rationalize its decisions. The LLM-generated explanations can provide additional information to the verifier model as well as help humans better understand the LLM labels. We demonstrate that our approach is able to identify potentially incorrect LLM labels for human re-annotation. Furthermore, we investigate the impact of presenting LLM labels and explanations on human re-annotation through crowdsourced studies.
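A hedged Python sketch of the three-step pipeline outlined above is given below. The verifier here is a stand-in heuristic rather than the authors' trained quality model, and the names (llm_label_with_explanation, verify, route_for_reannotation) and thresholds are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def llm_label_with_explanation(text: str, labels: list[str]) -> tuple[str, str]:
    """Step 1: the LLM proposes a label and rationalizes its decision."""
    prompt = (
        f"Label the text with one of: {', '.join(labels)}.\n"
        "Reply with two lines: 'label: <label>' then 'explanation: <one sentence>'.\n"
        f"Text: {text}"
    )
    reply = client.chat.completions.create(
        model="gpt-4-turbo", messages=[{"role": "user", "content": prompt}]
    ).choices[0].message.content
    label_line, _, rest = reply.partition("\n")
    return label_line.replace("label:", "").strip(), rest.replace("explanation:", "").strip()

def verify(label: str, explanation: str) -> float:
    """Step 2: stand-in verifier; a real system would score label quality with a trained model."""
    # Hypothetical heuristic: non-trivial explanations that mention the label score higher.
    if label.lower() in explanation.lower():
        return min(1.0, len(explanation) / 200)
    return 0.2

def route_for_reannotation(texts: list[str], labels: list[str], threshold: float = 0.5):
    """Step 3: collect items whose verification score falls below the threshold for human re-annotation."""
    needs_human = []
    for text in texts:
        label, explanation = llm_label_with_explanation(text, labels)
        if verify(label, explanation) < threshold:
            needs_human.append((text, label, explanation))
    return needs_human
```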
The application of knowledge distillation to reduce hallucination in large language models represents a novel and significant advancement in enhancing the reliability and accuracy of AI-generated content. The research presented here demonstrates the efficacy of transferring knowledge from a high-capacity teacher model to a more compact student model, leading to substantial improvements in exact match accuracy and notable reductions in hallucination rates. The methodology involved the use of temperature scaling, intermediate layer matching, and a comprehensive evaluation using the MMLU benchmark, which assessed the model's performance across a diverse set of tasks. Experimental results indicated that the distilled student model outperformed the baseline in generating accurate and contextually appropriate responses while maintaining computational efficiency. The findings underscore the potential of knowledge distillation as a scalable solution for improving the factual robustness of language models, making them applicable to real-world scenarios that demand high factual accuracy. Future directions include exploring multilingual and multi-modal distillation, integrating reinforcement learning, and developing refined evaluation metrics to further enhance performance.
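As a generic illustration of the two ingredients named above, the PyTorch sketch below combines a temperature-scaled soft-label loss with an intermediate-layer matching term. The loss weights, the chosen layers, and the assumption that teacher and student hidden sizes match are illustrative, not details from the study.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      student_hidden, teacher_hidden,
                      temperature=2.0, alpha=0.5, beta=0.1):
    """Combine soft-label distillation, hard-label cross-entropy, and layer matching."""
    # Soft-label term: KL divergence between temperature-scaled distributions.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * temperature ** 2

    # Hard-label term on the ground-truth classes/tokens.
    ce = F.cross_entropy(student_logits, labels)

    # Intermediate layer matching: align selected hidden states; student and
    # teacher hidden sizes are assumed equal here (otherwise project one first).
    layer_match = F.mse_loss(student_hidden, teacher_hidden)

    return alpha * kd + (1 - alpha) * ce + beta * layer_match
```

Multiplying the KL term by the squared temperature keeps the soft-label gradients on the same scale as the hard-label term, which is the standard correction when temperature scaling is used.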