Nature Communications, Journal Year: 2024, Volume and Issue: 15(1), Published: June 29, 2024
Abstract
When processing language, the brain is thought to deploy specialized computations to construct meaning from complex linguistic structures. Recently, artificial neural networks based on the Transformer architecture have revolutionized the field of natural language processing. Transformers integrate contextual information across words via structured circuit computations. Prior work has focused on the internal representations (“embeddings”) generated by these circuits. In this paper, we instead analyze the circuit computations directly: we deconstruct them into functionally-specialized “transformations” that integrate contextual information across words. Using functional MRI data acquired while participants listened to naturalistic stories, we first verify that these transformations account for considerable variance in brain activity across the cortical language network. We then demonstrate that the emergent computations performed by individual, functionally-specialized “attention heads” differentially predict activity in specific cortical regions. These heads fall along gradients corresponding to different layers and context lengths in a low-dimensional space.
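As a rough illustration of the per-head “transformations” described above, the sketch below (Python, Hugging Face transformers, with GPT-2 as a stand-in model) extracts each attention head's attention-weighted sum of value vectors at every word position. The hook-based extraction and the example sentence are assumptions about one reasonable way to obtain these quantities; the paper's actual feature-extraction and fMRI encoding pipeline is more involved.

```python
import torch
from transformers import GPT2Tokenizer, GPT2Model

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

# Capture the concatenated query/key/value projections of every layer.
qkv = {}
def make_hook(layer):
    def hook(module, inputs, output):
        qkv[layer] = output.detach()            # (batch, seq, 3 * hidden)
    return hook

for i, block in enumerate(model.h):
    block.attn.c_attn.register_forward_hook(make_hook(i))

text = "We listened to naturalistic stories in the scanner"  # illustrative input
ids = tok(text, return_tensors="pt")

with torch.no_grad():
    out = model(**ids, output_attentions=True)

n_heads = model.config.n_head
head_dim = model.config.hidden_size // n_heads

# Per-head "transformations": attention-weighted sums of value vectors,
# taken before the output projection mixes heads back into the residual stream.
transformations = []   # one (seq_len, n_heads * head_dim) matrix per layer
for layer, attn in enumerate(out.attentions):   # attn: (1, heads, seq, seq)
    _, _, value = qkv[layer].split(model.config.hidden_size, dim=2)
    value = value.reshape(1, -1, n_heads, head_dim).transpose(1, 2)  # (1, heads, seq, d)
    per_head = attn @ value                                          # (1, heads, seq, d)
    transformations.append(per_head.transpose(1, 2).reshape(-1, n_heads * head_dim))
```

In an encoding analysis, each layer's (or each head's) matrix would then be regressed onto the fMRI time series, as sketched for a later study below.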
Nature Neuroscience, Journal Year: 2022, Volume and Issue: 25(3), P. 369 - 380, Published: March 1, 2022
Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language models (DLMs). Using a self-supervised next-word prediction task, these models generate appropriate responses to a given context. In the current study, nine participants listened to a 30-min podcast while their brain responses were recorded using electrocorticography (ECoG). We provide empirical evidence that the human brain and autoregressive DLMs share three fundamental computational principles as they process the same natural narrative: (1) both are engaged in continuous next-word prediction before word onset; (2) both match their pre-onset predictions to the incoming word to calculate post-onset surprise; (3) both rely on contextual embeddings to represent words in natural contexts. Together, our findings suggest a biologically feasible computational framework for studying the neural basis of language.
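The first two shared principles can be made concrete with a small sketch: given an autoregressive model such as GPT-2 (used here as a stand-in; the example sentence is illustrative), the distribution over token t computed from the tokens before t is the pre-onset prediction, and the negative log-probability of the word that actually arrives is the post-onset surprise.

```python
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "the quick brown fox jumps over the lazy dog"
ids = tok(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits                  # (1, seq, vocab)
log_probs = F.log_softmax(logits, dim=-1)

# Pre-onset prediction: distribution over token t given tokens < t.
# Post-onset surprise: negative log-probability of the token that actually arrived.
for t in range(1, ids.shape[1]):
    actual = ids[0, t].item()
    surprise = -log_probs[0, t - 1, actual].item()
    predicted = tok.decode(log_probs[0, t - 1].argmax().item())
    print(f"{tok.decode(actual)!r}: top prediction {predicted!r}, surprisal {surprise:.2f} nats")
```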
Proceedings of the National Academy of Sciences, Journal Year: 2022, Volume and Issue: 119(32), Published: Aug. 3, 2022
Understanding spoken language requires transforming ambiguous acoustic streams into a hierarchy of representations, from phonemes to meaning. It has been suggested that the brain uses prediction to guide the interpretation of incoming input. However, the role of prediction in language processing remains disputed, with disagreement about both the ubiquity and representational nature of these predictions. Here, we address both issues by analyzing brain recordings of participants listening to audiobooks, using a deep neural network (GPT-2) to precisely quantify contextual predictions. First, we establish that brain responses to words are modulated by ubiquitous predictions. Next, we disentangle model-based predictions into distinct dimensions, revealing dissociable neural signatures of predictions about syntactic category (parts of speech), phonemes, and semantics. Finally, we show that high-level (word) predictions inform low-level (phoneme) predictions, supporting hierarchical predictive processing. Together, these results underscore the ubiquity of prediction in language processing, showing that the brain spontaneously predicts upcoming language at multiple levels of abstraction.
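The hierarchical step, in which word-level predictions inform phoneme-level predictions, can be illustrated with a toy calculation: marginalize a next-word distribution (in the study, taken from GPT-2) over a pronunciation lexicon to obtain the probability of the next phoneme. The word probabilities and lexicon entries below are made up purely for illustration; the published analysis used full model distributions and a complete phonemic dictionary.

```python
# Toy word-level predictive distribution and toy pronunciation lexicon.
word_probs = {"cat": 0.5, "cap": 0.2, "dog": 0.2, "door": 0.1}
lexicon = {
    "cat": ["K", "AE", "T"],
    "cap": ["K", "AE", "P"],
    "dog": ["D", "AO", "G"],
    "door": ["D", "AO", "R"],
}

def phoneme_prediction(word_probs, lexicon, phonemes_heard):
    """Probability of each candidate next phoneme, obtained by marginalizing
    word-level predictions over the words consistent with what was heard so far."""
    consistent = {
        w: p for w, p in word_probs.items()
        if lexicon[w][:len(phonemes_heard)] == phonemes_heard
    }
    total = sum(consistent.values())
    dist = {}
    for w, p in consistent.items():
        nxt = lexicon[w][len(phonemes_heard)]
        dist[nxt] = dist.get(nxt, 0.0) + p / total
    return dist

# After hearing /K AE/, the word-level prior makes /T/ (from "cat") the most
# expected next phoneme: {'T': ~0.71, 'P': ~0.29}.
print(phoneme_prediction(word_probs, lexicon, ["K", "AE"]))
```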
Communications Biology, Journal Year: 2022, Volume and Issue: 5(1), Published: Feb. 16, 2022
Deep learning algorithms trained to predict masked words from large amounts of text have recently been shown to generate activations similar to those of the human brain. However, what drives this similarity remains currently unknown. Here, we systematically compare a variety of deep language models to identify the computational principles that lead them to generate brain-like representations of sentences. Specifically, we analyze the brain responses to 400 isolated sentences in a cohort of 102 subjects, each recorded for two hours with functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG). We then test where and when each of these algorithms maps onto the brain responses. Finally, we estimate how the architecture, training, and performance of these models independently account for the generation of brain-like representations. Our analyses reveal two main findings. First, the similarity between the algorithms and the brain primarily depends on their ability to predict words from context. Second, this similarity reveals the rise and maintenance of perceptual, lexical, and compositional representations within each cortical region. Overall, this study shows that modern language algorithms partially converge towards brain-like solutions, and thus delineates a promising path to unravel the foundations of natural language processing.
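The model-to-brain comparison in these studies typically rests on a cross-validated linear encoding model. The sketch below shows that general recipe with placeholder random arrays standing in for the 400-sentence model activations and the fMRI/MEG responses; the array shapes, the layer choice implied by the comments, and the ridge penalties are assumptions, and the published analyses include additional steps (noise ceilings, subject-level modelling) not shown here.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Placeholder data: one row per sentence. X holds network activations
# (e.g., one GPT-2 layer), Y holds brain responses (voxels or sensors).
n_sentences, n_features, n_channels = 400, 768, 50
X = rng.standard_normal((n_sentences, n_features))
Y = rng.standard_normal((n_sentences, n_channels))

# "Brain score": cross-validated correlation between each brain channel and
# its prediction from a ridge regression fitted on the model activations.
scores = np.zeros(n_channels)
for train, test in KFold(n_splits=5).split(X):
    ridge = RidgeCV(alphas=np.logspace(-1, 4, 6)).fit(X[train], Y[train])
    pred = ridge.predict(X[test])
    for c in range(n_channels):
        scores[c] += pearsonr(pred[:, c], Y[test, c])[0] / 5

print("mean brain score across channels:", scores.mean())
```

With random placeholder arrays the score hovers around zero; with real activations and recordings the same procedure yields the layer-wise and region-wise scores reported in the studies above.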
Nature Human Behaviour, Journal Year: 2023, Volume and Issue: 7(3), P. 430 - 441, Published: March 2, 2023
Abstract
Considerable progress has recently been made in natural language processing: deep learning algorithms are increasingly able to generate, summarize, translate and classify texts. Yet, these language models still fail to match the language abilities of humans. Predictive coding theory offers a tentative explanation for this discrepancy: while language models are optimized to predict nearby words, the human brain would continuously predict a hierarchy of representations that spans multiple timescales. To test this hypothesis, we analysed the functional magnetic resonance imaging signals of 304 participants listening to short stories. First, we confirmed that the activations of modern language models linearly map onto the brain responses to speech. Second, we showed that enhancing these algorithms with predictions that span multiple timescales improves this brain mapping. Finally, we showed that these predictions are organized hierarchically: frontoparietal cortices predict higher-level, longer-range and more contextual representations than temporal cortices. Overall, these results strengthen the role of hierarchical predictive processing in language and illustrate how the synergy between neuroscience and artificial intelligence can unravel the computational bases of cognition.
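One simplified way to picture the enhancement with longer-range predictions is to append a representation of upcoming words to each word's activation and ask whether the same encoding model then fits the brain better. The function below is only a schematic, single-future-word version of that idea with placeholder arrays; the study built forecast representations from windows of future words and varied their distance and depth.

```python
import numpy as np

def add_forecast_window(activations, d):
    """Append the activation of the word d positions ahead to each word's
    activation (zero-padded at the end of the story). A deliberately
    simplified stand-in for a multi-word forecast window."""
    future = np.zeros_like(activations)
    if d > 0:
        future[:-d] = activations[d:]
    return np.concatenate([activations, future], axis=1)

# activations: (n_words, n_dims) language-model activations aligned to story words.
rng = np.random.default_rng(0)
activations = rng.standard_normal((1000, 768))     # placeholder values
enhanced = add_forecast_window(activations, d=8)
print(enhanced.shape)   # (1000, 1536)

# Fit the same ridge encoding model (see the earlier sketch) on `activations`
# and on `enhanced`; the gain in brain score is the "forecast" effect of interest.
```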
Cognitive Science, Journal Year: 2023, Volume and Issue: 47(7), Published: July 1, 2023
Humans can attribute beliefs to others. However, it is unknown to what extent this ability results from an innate biological endowment or from experience accrued through child development, particularly exposure to language describing others' mental states. We test the viability of this hypothesis by assessing whether models exposed to large quantities of human language display sensitivity to the implied knowledge states of characters in written passages. In pre-registered analyses, we present a linguistic version of the False Belief Task to both human participants and a large language model, GPT-3. Both are sensitive to others' beliefs, but while the language model significantly exceeds chance behavior, it does not perform as well as the humans, nor does it explain the full extent of their behavior, despite being exposed to more language than a human would in a lifetime. This suggests that while statistical learning may in part explain how humans develop the ability to reason about the minds of others, other mechanisms are also responsible.
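A minimal version of such a linguistic false-belief probe scores two candidate completions of a vignette under a causal language model and compares their log-probabilities. The sketch below uses GPT-2 as a freely available stand-in for GPT-3, and the vignette, the continuation_logprob helper and the two-completion scoring are illustrative assumptions rather than the study's pre-registered materials.

```python
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def continuation_logprob(prompt, continuation):
    """Sum of log-probabilities the model assigns to `continuation` given `prompt`."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    cont_ids = tok(continuation, return_tensors="pt").input_ids
    ids = torch.cat([prompt_ids, cont_ids], dim=1)
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = F.log_softmax(logits, dim=-1)
    start = prompt_ids.shape[1]
    return sum(
        log_probs[0, t - 1, ids[0, t]].item() for t in range(start, ids.shape[1])
    )

# Made-up false-belief vignette: the character did not see the object being moved,
# so a belief-sensitive reader should expect her to search the original location.
story = ("Anna put the chocolate in the drawer and left the room. "
         "While she was away, Tom moved it to the cupboard. "
         "Anna comes back and looks for the chocolate in the")
print("drawer  :", continuation_logprob(story, " drawer"))
print("cupboard:", continuation_logprob(story, " cupboard"))
```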