bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2024, Issue: unknown
Published: Sep. 13, 2024
Abstract
Forming an episodic memory requires binding together disparate elements that co-occur in a single experience. One model of this process is that neurons representing different components of a memory bind to an “index”, a subset of neurons unique to that memory. Evidence for this model has recently been found in chickadees, which use hippocampal memory to store and recall locations of cached food. Chickadee hippocampus produces sparse, high-dimensional patterns (“barcodes”) that uniquely specify each caching event. Unexpectedly, the same neurons that participate in barcodes also exhibit conventional place tuning. It is unknown how barcode activity is generated, and what role it plays in memory formation and retrieval. It is also unclear how a memory index (e.g. barcodes) could function in the same neural population that represents memory content (e.g. place). Here, we design a biologically plausible model that generates barcodes and uses them to bind experiential content. Our model generates barcodes from place inputs through the chaotic dynamics of a recurrent neural network and uses Hebbian plasticity to store barcodes as attractor states. The model matches experimental observations that memory indices (barcodes) and content signals (place tuning) are randomly intermixed in the activity of single neurons. We demonstrate that barcodes reduce memory interference between correlated experiences. We also show that place tuning plays a complementary role to barcodes, enabling flexible, contextually-appropriate memory retrieval. Finally, our model is compatible with previous models of the hippocampus as generating a predictive map. Distinct predictive and indexing functions of the network are achieved via an adjustment of global recurrent gain. Our results suggest how the hippocampus may use barcodes to resolve fundamental tensions between memory specificity (pattern separation) and flexible recall (pattern completion) in general memory systems.
Neural Computation, Journal year: 2024, Issue: 36(11), pp. 2225-2298
Published: Aug. 30, 2024
Abstract
Adaptive behavior often requires predicting future events. The theory of reinforcement learning prescribes what kinds of predictive representations are useful and how to compute them. This review integrates these theoretical ideas with work on cognition and neuroscience. We pay special attention to the successor representation and its generalizations, which have been widely applied as both engineering tools and models of brain function. This convergence suggests that particular kinds of predictive representations may function as versatile building blocks of intelligence.
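A minimal sketch of the successor representation named in this abstract, assuming a small tabular Markov chain under a fixed policy. The function names, the three-state ring, and the discount factor are hypothetical choices made for illustration and are not taken from the review.

```python
import numpy as np

def successor_representation(P, gamma):
    """Closed-form SR for a fixed policy: M = (I - gamma * P)^-1.

    M[s, s'] is the expected discounted number of future visits to s'
    when starting from s (counting the current step).
    """
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)

def state_values(M, reward):
    """State values factor as V = M @ r for a per-state reward vector r."""
    return M @ reward

# Hypothetical example: deterministic ring 0 -> 1 -> 2 -> 0, reward only in state 2.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
M = successor_representation(P, gamma=0.5)
V = state_values(M, np.array([0.0, 0.0, 1.0]))
# V is [2/7, 4/7, 8/7]: states closer to the reward have higher value.
```

Because the reward enters only through the final matrix-vector product, changing the reward vector re-evaluates all states without recomputing M, which is one reason the representation is often described as a reusable building block.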
Proceedings of the National Academy of Sciences, Journal year: 2024, Issue: 121(17)
Published: April 17, 2024
Design of hardware based on biological principles of neuronal computation and plasticity in the brain is a leading approach to realizing energy- and sample-efficient AI and learning machines. An important factor in selection of the hardware building blocks is the identification of candidate materials with physical properties suitable to emulate the large dynamic ranges and varied timescales of neuronal signaling. Previous work has shown that the all-or-none spiking behavior of neurons can be mimicked by threshold switches utilizing material phase transitions. Here, we demonstrate that devices based on a prototypical metal-insulator-transition material, vanadium dioxide (VO₂), can be dynamically controlled to access a continuum of intermediate resistance states. Furthermore, the timescale of their intrinsic relaxation can be configured to match a range of biologically relevant timescales from milliseconds to seconds. We exploit these device properties to emulate three aspects of neuronal analog computation: fast (~1 ms) spiking in a neuronal soma compartment, slow (~100 ms) spiking in a dendritic compartment, and ultraslow (~1 s) biochemical signaling involved in temporal credit assignment for a recently discovered biological mechanism of one-shot learning. Simulations show that an artificial neural network using properties of VO₂ devices to control an agent navigating a spatial environment can learn an efficient path to a reward in up to fourfold fewer trials than standard methods. The phase relaxations described in our study may be engineered in a variety of materials and can be controlled by thermal, electrical, or optical stimuli, suggesting further opportunities to emulate biological learning in neuromorphic hardware.
How external/internal ‘state’ is represented in the brain is crucial, since appropriate representation enables goal-directed behavior. Recent studies suggest that state representation and state value can be simultaneously learnt through reinforcement learning (RL) using reward-prediction-error in a recurrent-neural-network (RNN) and its downstream weights. However, how such learning can be neurally implemented remains unclear because training of the RNN through the ‘backpropagation’ method requires downstream weights, which are biologically unavailable at the upstream RNN. Here we show that training using random feedback instead of the downstream weights still works because of ‘feedback alignment’, which was originally demonstrated for supervised learning. We further show that if the downstream weights and the random feedback are biologically constrained to be non-negative, learning occurs without feedback alignment because the non-negative constraint ensures loose alignment. These results suggest neural mechanisms for RL of state representation/value and the power of random feedback and biological constraints.
Hebbian plasticity has long dominated neurobiological models of memory formation. Yet, plasticity rules operating on one-shot episodic memory timescales rarely depend on both pre- and postsynaptic spiking, challenging Hebbian theory in this crucial regime. Here, we present an episodic memory model governed by a simpler rule depending only on presynaptic activity. We show that this rule, capitalizing on high-dimensional neural activity with restricted transitions, naturally stores episodes as paths through complex state spaces like those underlying a world model. The resulting memory traces, which we term path vectors, are highly expressive and decodable with an odor-tracking algorithm. We show that path vectors are robust alternatives to Hebbian traces, support sequential and associative recall, along with policy learning, and shed light on specific hippocampal plasticity rules. Thus, non-Hebbian plasticity is sufficient for flexible memory and learning, and well-suited to encode episodes and policies as paths through a world model.

An animal entering a new environment typically faces three challenges: explore the space for resources, memorize their locations, and navigate towards those targets as needed. Here we propose a neural algorithm that can solve all these problems and operates reliably in diverse and complex environments. At its core, the mechanism makes use of a behavioral module common to all motile animals, namely the ability to follow an odor to its source. We show how the brain can learn to generate internal "virtual odors" that guide the animal to any location of interest. This endotaxis algorithm can be implemented with a simple 3-layer neural circuit using only biologically realistic structures and learning rules. Several neural components of this scheme are found in brains from insects to humans. Nature may have evolved a general mechanism for search and navigation on the ancient backbone of chemotaxis.
Frontiers in Neuroscience, Journal year: 2023, Issue: 17
Published: Sep. 5, 2023
For adaptive real-time behavior in real-world contexts, the brain needs to allow past information over multiple timescales to influence current processing for making choices that create the best outcome as a person goes about their everyday life. The neuroeconomics literature on value-based decision-making has formalized such choice through reinforcement learning models for two extreme strategies. These strategies are model-free (MF), which is an automatic, stimulus–response type of action, and model-based (MB), which bases choice on cognitive representations of the world and causal inference on environment-behavior structure. The emphasis of examining the neural substrates of value-based decision making has been on the striatum and prefrontal regions, especially with regards to “here and now” decision-making. Yet, such a dichotomy does not embrace all the dynamic complexity involved. In addition, despite robust research on the role of the hippocampus in memory and spatial learning, its contribution to value-based decision making is just starting to be explored. This paper aims to better appreciate the role of the hippocampus in decision-making and advance the successor representation (SR) as a candidate mechanism for encoding state representations in the hippocampus, separate from reward representations. To this end, we review research that relates hippocampal sequences to SR models, showing that the implementation of such sequences in reinforcement learning agents improves their performance. This also enables the agents to perform multiscale temporal processing in a biologically plausible manner. Altogether, we articulate a framework to advance current striatal and prefrontal-focused decision making to better account for multiscale mechanisms underlying various real-world time-related concepts such as the self that cumulates over a person’s life course.
Electronics, Journal year: 2023, Issue: 12(20), p. 4212
Published: Oct. 11, 2023
The focus of this study is to investigate the impact of different initialization strategies for the weight matrix of Successor Features (SF) on the learning efficiency and convergence in Reinforcement Learning (RL) agents. Using a grid-world paradigm, we compare the performance of RL agents whose SF weight matrix is initialized with either an identity matrix, a zero matrix, or a randomly generated matrix (using the Xavier, He, or uniform distribution method). Our analysis revolves around evaluating metrics such as the value error, step length, PCA of the Successor Representation (SR) place field, and the distance of the SR matrices between different agents. The results demonstrate that the RL agents initialized with random matrices reach the optimal SR place field faster and showcase a quicker reduction in value error, pointing to more efficient learning. Furthermore, these random agents also exhibit a faster decrease in step length across larger grid-world environments. The study provides insights into the neurobiological interpretations of these results, their implications for understanding intelligence, and potential future research directions. These findings could have profound implications for the field of artificial intelligence, particularly in the design of learning algorithms.
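The initialization strategies compared in this abstract can be sketched as follows. This is an illustrative sketch only: the function names, the square matrix shape, the Xavier-uniform and He-normal variants, the [0, 1) range for the uniform case, and the tabular TD update are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def init_sf_weights(n, method, rng=None):
    """Return an n x n starting matrix for the SF/SR weights (hypothetical helper)."""
    rng = np.random.default_rng(0) if rng is None else rng
    if method == "identity":
        return np.eye(n)
    if method == "zero":
        return np.zeros((n, n))
    if method == "xavier":
        # Xavier/Glorot uniform: limit = sqrt(6 / (fan_in + fan_out)).
        limit = np.sqrt(6.0 / (n + n))
        return rng.uniform(-limit, limit, size=(n, n))
    if method == "he":
        # He normal: standard deviation sqrt(2 / fan_in).
        return rng.normal(0.0, np.sqrt(2.0 / n), size=(n, n))
    if method == "uniform":
        # Assumed range [0, 1); the abstract does not state the bounds.
        return rng.uniform(0.0, 1.0, size=(n, n))
    raise ValueError(f"unknown initialization method: {method}")

def sr_td_update(M, s, s_next, gamma=0.95, alpha=0.1):
    """One tabular TD(0) step on row s of the SR matrix; returns a new matrix."""
    M = M.copy()
    onehot = np.zeros(M.shape[0])
    onehot[s] = 1.0
    # TD target: current-state indicator plus discounted SR of the next state.
    target = onehot + gamma * M[s_next]
    M[s] += alpha * (target - M[s])
    return M

# Usage: start from a zero matrix and apply one observed transition 0 -> 1.
M0 = init_sf_weights(3, "zero")
M1 = sr_td_update(M0, s=0, s_next=1, gamma=0.5, alpha=0.1)
```

Whichever starting matrix is chosen, the same TD update is applied afterwards, so in this sketch the strategies differ only in where learning begins.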