Analysis of existing datasets of eye movements in reading is a valuable tool for vocabulary research because it allows researchers to examine word recognition in an authentic context. We argue that such secondary analysis is an important addition to new experimental studies and mega-studies because it examines real text rather than crammed conditions or isolation. Corpora in which participants read long texts are particularly interesting because they provide rich material that can be better controlled for confounding variables, but a collection of small data sets also contains more variation than is typically possible in a single study. We discuss the considerations to take into account when dealing with eye movement data and urge colleagues to make their data available in the spirit of open science so that a larger database can be built quickly.
Open Mind, Journal Year: 2024, Volume and Issue: 8, P. 177-201, Published: Jan. 1, 2024
Many studies of human language processing have shown that readers slow down at less frequent or less predictable words, but there is debate about whether frequency and predictability effects reflect separable cognitive phenomena: are cognitive operations that retrieve words from the mental lexicon based on sensory cues distinct from those that predict upcoming words based on context? Previous evidence for a frequency-predictability dissociation is mostly based on small samples (both for estimating predictability and frequency and for testing their effects on human behavior), artificial materials (e.g., isolated constructed sentences), and implausible modeling assumptions (discrete-time dynamics, linearity, additivity, constant variance, and invariance over time), which raises the question: do frequency and predictability dissociate in ordinary language comprehension, such as story reading? This study leverages recent progress in open data and computational modeling to address this question at scale. A large collection of naturalistic reading data (six datasets, >2.2M datapoints) is analyzed using nonlinear continuous-time regression, and frequency and predictability are estimated using statistical language models trained on more data than is currently typical in psycholinguistics. Despite the use of naturalistic data, strong predictability estimates, and flexible regression models, results converge with earlier experimental studies in supporting dissociable and additive frequency and predictability effects.
Journal of Memory and Language, Journal Year: 2023, Volume and Issue: 135, P. 104496, Published: Dec. 19, 2023
Models of eye-movement control during reading, developed largely within psychology, usually focus on visual, attentional, lexical, and motor processes but neglect post-lexical language processing; by contrast, models of sentence comprehension processes, developed largely within psycholinguistics, generally focus only on post-lexical language processes. We present a model that combines these two research threads, integrating eye-movement control and sentence processing. Developing such an integrated model is extremely challenging and computationally demanding, but such an integration is an important step toward complete mathematical models of natural language comprehension in reading. We combine the SWIFT model of eye-movement control (Seelig et al., 2023) with key components of the Lewis and Vasishth sentence processing model (Lewis and Vasishth, 2005). This integration becomes possible, for the first time, due in part to recent advances in successful parameter identification in dynamical models, which allows us to investigate profile log-likelihoods for individual model parameters. We present a fully implemented proof-of-concept model demonstrating how such an integrated model can be achieved; our approach includes Bayesian model inference with Markov Chain Monte Carlo (MCMC) sampling as a key computational tool. The integrated Sentence-Processing and Eye-Movement Activation-Coupled Model (SEAM) can successfully reproduce eye movement patterns that arise due to similarity-based interference in reading. To our knowledge, this is the first-ever integration of a complete process model of eye-movement control with linguistic dependency completion processes in sentence comprehension. In future work, this proof of concept model will need to be evaluated using a comprehensive set of benchmark data.
Behavioral and Brain Sciences, Journal Year: 2023, Volume and Issue: 46, Published: Jan. 1, 2023
On several key issues we agree with the commentators. Perhaps most importantly, everyone seems to agree that psychology has an important role to play in building better models of human vision, and (most) everyone agrees (including us) that deep neural networks (DNNs) will play an important role in modelling human vision going forward. But there are also disagreements about what models are for, how DNN-human correspondences should be evaluated, the value of alternative modelling approaches, and the impact of marketing hype in the literature. In our view, these latter issues are contributing to many unjustified claims regarding DNN-human correspondences in vision and other domains of cognition. We explore all these issues in this response.
arXiv (Cornell University), Journal Year: 2023, Published: Jan. 1, 2023
Models of eye-movement control during reading, developed largely within psychology, usually focus on visual, attentional, lexical, and motor processes but neglect post-lexical language processing; by contrast, models of sentence comprehension processes, developed largely within psycholinguistics, generally focus only on post-lexical language processes. We present a model that combines these two research threads, integrating eye-movement control and sentence processing. Developing such an integrated model is extremely challenging and computationally demanding, but such an integration is an important step toward complete mathematical models of natural language comprehension in reading. We combine the SWIFT model of eye-movement control (Seelig et al., 2020, doi:10.1016/j.jmp.2019.102313) with key components of the Lewis and Vasishth sentence processing model (Lewis & Vasishth, 2005, doi:10.1207/s15516709cog0000_25). This integration becomes possible, for the first time, due in part to recent advances in successful parameter identification in dynamical models, which allows us to investigate profile log-likelihoods for individual model parameters. We present a fully implemented proof-of-concept model demonstrating how such an integrated model can be achieved; our approach includes Bayesian model inference with Markov Chain Monte Carlo (MCMC) sampling as a key computational tool. The integrated Sentence-Processing and Eye-Movement Activation-Coupled Model (SEAM) can successfully reproduce eye movement patterns that arise due to similarity-based interference in reading. To our knowledge, this is the first-ever integration of a complete process model of eye-movement control with linguistic dependency completion processes in sentence comprehension. In future work, this proof of concept model will need to be evaluated using a comprehensive set of benchmark data.
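The use of neural language models to model human behavior has met with mixed success. While some work has found that the surprisal estimates from these models can be used to predict a wide range of human neural and behavioral responses, other work studying more complex syntactic phenomena has found that these surprisal estimates generate incorrect behavioral predictions. This paper explores the extent to which the misalignment between empirical and model-predicted behavior can be minimized by training models on more developmentally plausible data, such as in the BabyLM Challenge. We trained teacher language models on the BabyLM "strict-small" dataset and used sentence level surprisal estimates from these teacher models to create a curriculum. We found tentative evidence that our curriculum made it easier for models to acquire linguistic knowledge from the training data: on the subset of tasks in the BabyLM challenge suite evaluating models' grammatical knowledge of English, models first trained on the BabyLM data curriculum and then on a few randomly ordered training epochs performed slightly better than models trained on randomly ordered epochs alone. This improved linguistic knowledge acquisition did not result in better alignment with human reading behavior, however: models trained on the BabyLM dataset (with or without a curriculum) generated predictions that were as misaligned with human behavior as models trained on larger, less curated datasets. This suggests that training on developmentally plausible datasets alone is likely insufficient to generate language models capable of accurately predicting human language processing.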
Philosophical Transactions of the Royal Society A Mathematical Physical and Engineering Sciences, Journal Year: 2024, Volume and Issue: 382(2268), Published: Jan. 29, 2024
Sheaves are mathematical objects that describe the globally compatible data associated with open sets of a topological space. Original examples of sheaves were continuous functions; later they also became powerful tools in algebraic geometry, as well as logic and set theory. More recently, sheaves have been applied to the theory of contextuality in quantum mechanics. Whenever the local data are not necessarily compatible, sheaves are replaced by the simpler setting of presheaves. In previous work, we used presheaves to model lexically ambiguous phrases in natural language and identified the order of their disambiguation. In the work presented here, we model syntactic ambiguities and study a phenomenon in human parsing called garden-pathing. It has been shown that the information-theoretic quantity known as ‘surprisal’ correlates with human reading times in natural language but fails to do so in garden-path sentences. We compute the degree of signalling in our presheaves using probabilities from the large language model BERT and evaluate predictions on two psycholinguistic datasets. Our degree of signalling outperforms surprisal in two ways: (i) it distinguishes between hard and easy garden-path sentences (with a p-value < 10⁻⁵), whereas existing work could not, and (ii) its garden-path effect is larger in one of the datasets (32 ms versus 8.75 ms per word), leading to better prediction accuracies. This article is part of the theme issue ‘Quantum contextuality, causality and freedom of choice’.
We investigated the mechanism of the lingering effect in relation to the garden-path effect based on self-paced reading and comprehension experiments in Japanese. The lingering effect refers to a phenomenon in which an initial misinterpretation persists in the final comprehension even after disambiguation, which occurs in a garden-path sentence. Through self-paced reading (Experiment 1) and comprehension tasks (Experiments 2 and 3), we explored how the length and head position of ambiguous regions influence the garden-path and lingering effects. Our results indicated that length and head position influenced the two effects in different ways. Consequently, a longer misparse strengthened one effect but weakened the other. Additionally, Surprisal affected one effect but not the other. These results support the notion that the garden-path and lingering effects are correlated but operate through different underlying processes. Specifically, one pertains to parsing, while the other relates to short-term memory.
Computational Linguistics, Journal Year: 2024, P. 1241-1290, Published: July 30, 2024
The staggering pace with which the capabilities of large language models (LLMs) are increasing, as measured by a range of commonly used natural language understanding (NLU) benchmarks, raises many questions regarding what “understanding” means for a language model and how it compares to human understanding. This is especially true since many LLMs are exclusively trained on text, casting doubt on whether their stellar benchmark performances are reflective of a true understanding of the problems represented by these benchmarks, or whether LLMs simply excel at uttering textual forms that correlate with what someone who understands the problem would say. In this philosophically inspired work, we aim to create some separation between form and meaning, with a series of tests that leverage the idea that world understanding should be consistent across presentational modes (inspired by Fregean senses) of the same meaning. Specifically, we focus on consistency across languages as well as paraphrases. Taking GPT-3.5 as our object of study, we evaluate multisense consistency across five different languages and various tasks. We start the evaluation in a controlled setting, asking the model for simple facts, and then proceed with an evaluation on four popular NLU benchmarks. We find that the model’s multisense consistency is lacking and run several follow-up analyses to verify that this lack of consistency is due to a sense-dependent task understanding. We conclude that, in this aspect, the understanding of LLMs is still quite far from being consistent and human-like, and deliberate on how this impacts their utility in the context of learning about human language and understanding.
How does grammatical markedness affect processing? Previous work has studied this extensively in the domain of experiencer verbs, by examining the question of alignment between thematic role and grammatical function hierarchies. Existing evidence is consistent with multiple accounts, including an experiencer-first preference or an experiencer-subject preference. We conducted two experiments to disentangle these effects for experiencer verbs, using self-paced reading with comprehension questions and speeded grammaticality judgments. Our results revealed a clear preference of one kind, and no preference of the other kind, in processing English. These results are most consistent with a view where these constraints cannot be reduced to more general cognitive constraints.