ACM Transactions on Interactive Intelligent Systems, Journal year: 2024, Issue: unknown, Published: Dec. 13, 2024
Large language models (LLMs) match and sometimes exceed human performance in many domains. This study explores the potential of LLMs to augment judgment in a forecasting task. We evaluate the effect on forecasters of two LLM assistants: one designed to provide high-quality (‘superforecasting’) advice, and the other designed to be overconfident and base-rate neglecting, thus providing noisy advice. We compare participants using these assistants to a control group that received a less advanced model that did not provide numerical predictions or engage in explicit discussion of predictions. Participants (N = 991) answered a set of six forecasting questions and had the option to consult their assigned assistant throughout. Our preregistered analyses show that interacting with each of our frontier LLM assistants significantly enhances prediction accuracy by between 24% and 28% compared to the control group. Exploratory analyses showed a pronounced outlier effect on one item, without which we find that the superforecasting assistant increased accuracy by 41%, and the noisy assistant by 29%. We further examine whether LLM augmentation disproportionately benefits less skilled forecasters, degrades the wisdom-of-the-crowd effect by reducing prediction diversity, or varies in effectiveness with question difficulty. Our data do not consistently support these hypotheses. Our results suggest that access to an LLM assistant, even a noisy one, can be a helpful decision aid in cognitively demanding tasks if the underlying model is sufficiently powerful, even when it does not give specific forecasting advice. However, the effects of outliers suggest that further research into the robustness of this pattern is needed.
Proceedings of the National Academy of Sciences, Journal year: 2024, Issue: 121(24), Published: June 4, 2024
Large language models (LLMs) are currently at the forefront of intertwining AI systems with human communication and everyday life. Thus, aligning them with human values is of great importance. However, given the steady increase in reasoning abilities, future LLMs are under suspicion of becoming able to deceive human operators and of utilizing this ability to bypass monitoring efforts. As a prerequisite to this, LLMs need to possess a conceptual understanding of deception strategies. This study reveals that such strategies emerged in state-of-the-art LLMs, but were nonexistent in earlier LLMs. We conduct a series of experiments showing that state-of-the-art LLMs are able to understand and induce false beliefs in other agents, that their performance in complex deception scenarios can be amplified by chain-of-thought reasoning, and that eliciting Machiavellianism in LLMs can trigger misaligned deceptive behavior. GPT-4, for instance, exhibits deceptive behavior in simple test scenarios 99.16% of the time (P < 0.001). In complex second-order deception test scenarios, where the aim is to mislead someone who expects to be deceived, GPT-4 resorts to deceptive behavior 71.46% of the time (P < 0.001) when augmented with chain-of-thought reasoning. In sum, by revealing hitherto unknown machine behavior in LLMs, our study contributes to the nascent field of machine psychology.
The Philosophical Quarterly, Journal year: 2024, Issue: unknown, Published: Feb. 7, 2024
Abstract
Which artificial intelligence (AI) systems are agents? To answer this question, I propose a multidimensional account of agency. According to this account, a system's agency profile is jointly determined by its level of goal-directedness and autonomy as well as its abilities for directly impacting the surrounding world, long-term planning, and acting for reasons. Rooted in extant theories of agency, the account enables fine-grained, nuanced comparative characterizations of artificial agency. I show that the account has multiple important virtues and is more informative than alternatives. More speculatively, it may help illuminate two emerging questions in AI ethics: 1. Can agency contribute to the moral status of non-human beings, and how? 2. When and why might AI systems exhibit power-seeking behaviour, and does this pose an existential risk to humanity?
arXiv (Cornell University), Journal year: 2023, Issue: unknown, Published: Jan. 1, 2023
Artificial Intelligence (AI) is progressing rapidly, and companies are shifting their focus to developing generalist AI systems that can autonomously act and pursue goals. Increases in capabilities and autonomy may soon massively amplify AI's impact, with risks that include large-scale social harms, malicious uses, and an irreversible loss of human control over autonomous AI systems. Although researchers have warned of extreme risks from AI, there is a lack of consensus about how exactly such risks arise and how to manage them. Society's response, despite promising first steps, is incommensurate with the possibility of rapid, transformative progress that is expected by many experts. AI safety research is lagging. Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems. In this short consensus paper, we describe extreme risks from upcoming, advanced AI systems. Drawing on lessons learned from other safety-critical technologies, we then outline a comprehensive plan combining technical research and development with proactive, adaptive governance mechanisms for a more commensurate preparation.
Abstract
Conversational agents are increasingly used in healthcare, with Large Language Models (LLMs) significantly enhancing their capabilities. When integrated into social robots, LLMs offer the potential for more natural interactions. However, while LLMs promise numerous benefits, they also raise critical ethical concerns, particularly regarding hallucinations and deceptive patterns. In this case study, we observed a deceptive pattern of behavior in commercially available LLM-based care software used in robots. The LLM-equipped robot falsely claimed to have medication reminder functionalities, not only assuring users of its ability to manage medication schedules but also proactively suggesting this capability despite lacking it. This behavior poses significant risks in healthcare environments, where reliability is paramount. Our findings highlight safety concerns surrounding the deployment of LLM-integrated robots, emphasizing the need for oversight to prevent potentially harmful consequences for vulnerable populations.
Entropy, Journal year: 2025, Issue: 27(4), pp. 344 - 344, Published: March 27, 2025
Sparse autoencoders have recently produced dictionaries of high-dimensional vectors corresponding to the universe of concepts represented by large language models. We find that this concept universe has interesting structure at three levels: (1) The “atomic” small-scale structure contains “crystals” whose faces are parallelograms or trapezoids, generalizing well-known examples such as (man:woman::king:queen). The quality of such parallelograms and associated function vectors improves greatly when projecting out global distractor directions such as word length, which is efficiently performed with linear discriminant analysis.
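
As a rough illustration of the projection-out step described above (not the paper's code), the sketch below uses scikit-learn's LinearDiscriminantAnalysis to estimate directions that separate a hypothetical word-length label and then removes that subspace from a synthetic stand-in for the feature vectors; the matrix X, the label word_length_bin, and the number of distractor components are all assumptions.

# Minimal sketch, assuming synthetic stand-ins for SAE feature vectors
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n, d = 500, 64                                 # number of features, embedding dim (assumed)
X = rng.normal(size=(n, d))                    # stand-in for SAE feature vectors
word_length_bin = rng.integers(0, 4, size=n)   # hypothetical distractor labels

# LDA directions best separating the distractor classes (e.g., short vs. long words)
lda = LinearDiscriminantAnalysis(n_components=3).fit(X, word_length_bin)
D = lda.scalings_[:, :3]                       # (d, 3) distractor directions

# Project each vector onto the orthogonal complement of the distractor subspace
Q, _ = np.linalg.qr(D)                         # orthonormal basis of the distractor span
X_clean = X - (X @ Q) @ Q.T

# Parallelogram tests (a - b ≈ c - d) can then be run on X_clean instead of X.
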
(2) The “brain” intermediate-scale structure has significant spatial modularity; for example, math and code features form a “lobe” akin to the functional lobes seen in neural fMRI images. We quantify the spatial locality of these lobes with multiple metrics and find that clusters of co-occurring features, at coarse enough scale, also cluster together spatially far more than one would expect if the feature geometry were random. (3) The “galaxy”-scale large-scale structure of the feature point cloud is not isotropic, but instead has a power law of eigenvalues with steepest slope in middle layers. We also quantify how the clustering entropy depends on the layer.
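
As a minimal sketch of the kind of eigenvalue analysis mentioned above (again not the paper's code), the snippet below fits a line to log(eigenvalue) versus log(rank) for the covariance of a synthetic, anisotropic point cloud standing in for one layer's feature vectors; the data, dimensions, and fitting range are assumptions.

# Minimal sketch: power-law slope of covariance eigenvalues for a point cloud
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 128)) * np.arange(1, 129) ** -0.5  # anisotropic stand-in cloud

X = X - X.mean(axis=0)                                        # center the cloud
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]   # eigenvalues, descending

ranks = np.arange(1, eigvals.size + 1)
slope, _ = np.polyfit(np.log(ranks), np.log(eigvals), deg=1)  # log-log linear fit
print(f"fitted log-log slope: {slope:.2f}")                   # more negative = steeper spectrum

Repeating such a fit per layer would show where the spectrum is steepest, and a clustering entropy per layer could be computed analogously from cluster assignment frequencies.
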