Proceedings of the Linguistic Society of America,
Journal year: 2024, Issue 9(1), pp. 5693-5693. Published: May 15, 2024
It has been argued that language models (LMs) can inform our knowledge of language acquisition. While LMs are claimed to replicate aspects of grammatical knowledge, it remains unclear how this translates to acquisition directly. We ask if a model trained specifically on child-directed speech (CDS) is able to capture adjectives. Ultimately, the results reveal that what the model is "learning" is how adjectives are distributed in CDS, and not the properties of different adjective classes. While highlighting the model's ability to learn distributional information, these findings suggest that such information alone cannot explain how children generalize beyond their input.
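The contrast drawn here, between picking up how adjectives are distributed in CDS and learning properties of adjective classes, is at heart a claim about distributional statistics. As a rough illustration of the kind of distributional information at issue (not the paper's model, which is a neural LM trained on CDS), the sketch below builds co-occurrence vectors for a few adjectives from a hypothetical CDS text file and compares them by cosine similarity; the file name, adjective list, and window size are assumptions.

```python
# Toy sketch: distributional vectors for adjectives in child-directed speech.
# File name, adjective list, and window size are illustrative assumptions.
from collections import Counter, defaultdict
import math

ADJECTIVES = ["big", "red", "broken", "happy"]   # example adjectives (assumed)
WINDOW = 2                                       # words of context on each side

cooc = defaultdict(Counter)
with open("childes_cds.txt", encoding="utf-8") as f:   # hypothetical corpus file
    for line in f:
        words = line.lower().split()
        for i, w in enumerate(words):
            if w in ADJECTIVES:
                for j in range(max(0, i - WINDOW), min(len(words), i + WINDOW + 1)):
                    if j != i:
                        cooc[w][words[j]] += 1

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse co-occurrence vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Adjectives that occur in similar CDS contexts receive high similarity,
# regardless of whether they belong to the same adjective class.
print(cosine(cooc["big"], cooc["red"]))
```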
Proceedings of the National Academy of Sciences,
Journal year: 2022, Issue 119(32). Published: Aug. 3, 2022
Understanding spoken language requires transforming ambiguous acoustic streams into a hierarchy of representations, from phonemes to meaning. It has been suggested that the brain uses prediction to guide the interpretation of incoming input. However, the role of prediction in language processing remains disputed, with disagreement about both its ubiquity and representational nature. Here, we address both issues by analyzing brain recordings of participants listening to audiobooks, using a deep neural network (GPT-2) to precisely quantify contextual predictions. First, we establish that brain responses to words are modulated by ubiquitous predictions. Next, we disentangle model-based predictions into distinct dimensions, revealing dissociable signatures of predictions about syntactic category (parts of speech), phonemes, and semantics. Finally, we show that high-level (word) predictions inform low-level (phoneme) predictions, supporting hierarchical predictive processing. Together, these results underscore the role of prediction in language processing, showing that the brain spontaneously predicts upcoming language at multiple levels of abstraction.
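Using GPT-2 to quantify contextual predictions amounts to computing word-by-word surprisal from the model's next-token probabilities. A minimal sketch of that technique, using the Hugging Face transformers library; the model size and example sentence are assumptions, and this is not the study's actual analysis pipeline.

```python
# Minimal sketch: per-token surprisal (-log2 p) from a pretrained GPT-2.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "The brain predicts upcoming words during listening."  # example sentence (assumed)
ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits  # shape: (1, seq_len, vocab_size)

log_probs = torch.log_softmax(logits, dim=-1)

# The surprisal of token t comes from the distribution predicted at position t-1.
for t in range(1, ids.size(1)):
    token = tokenizer.decode(ids[0, t].item())
    surprisal = -log_probs[0, t - 1, ids[0, t]].item() / math.log(2)
    print(f"{token!r}: {surprisal:.2f} bits")
```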
Abstract
To what degree can language be acquired from linguistic input alone? This question has vexed scholars for millennia and is still a major focus of debate in the cognitive science of language. The complexity of human language has hampered progress because studies of language, especially those involving computational modeling, have only been able to deal with small fragments of our linguistic skills. We suggest that the most recent generation of Large Language Models (LLMs) might finally provide the tools to determine empirically how much of the human language ability can be acquired from linguistic experience. LLMs are sophisticated deep learning architectures trained on vast amounts of natural language data, enabling them to perform an impressive range of linguistic tasks. We argue that, despite their clear semantic and pragmatic limitations, LLMs have already demonstrated human-like grammatical language without the need for a built-in grammar. Thus, while there is still much to learn about how humans acquire and use language, LLMs offer full-fledged computational models with which cognitive scientists can evaluate just how far statistical learning might take us in explaining the full complexity of human language.
Computational Linguistics,
Journal year: 2023, Issue 50(1), pp. 293-350. Published: Nov. 15, 2023
Abstract
Transformer language models have received widespread public attention, yet their generated text is often surprising even to NLP researchers. In this survey, we discuss over 250 recent studies of English language model behavior before task-specific fine-tuning. Language models possess basic capabilities in syntax, semantics, pragmatics, world knowledge, and reasoning, but these capabilities are sensitive to specific inputs and surface features. Despite dramatic increases in the quality of generated text as models scale to hundreds of billions of parameters, the models are still prone to unfactual responses, commonsense errors, memorized text, and social biases. Many of these weaknesses can be framed as over-generalizations or under-generalizations of learned patterns in text. We synthesize recent results to highlight what is currently known about large language model capabilities, thus providing a resource for applied work and for research in adjacent fields that use language models.
Linguistik aktuell,
Journal year: 2024, Issue unknown. Published: Jan. 12, 2024
This book examines three controversial generalizations concerning wh-island effects in Chinese: the argument and adjunct asymmetry, the subject and object asymmetry, and the D-linked and non-D-linked asymmetry. Experiments under the factorial definition of island effects reveal that: (1) both argument and adjunct wh-in-situ are sensitive to the wh-island, displaying no asymmetry; (2) subject and object wh-in-situ differ, with one condition manifesting a larger magnitude of island effects and the other showing a smaller effect size due to a confounding double name penalty, an asymmetry that exhibits a special pattern; (3) who-in-situ evinces wh-island effects, while what-in-situ demonstrates only marginal effects. The findings support a theory of covert wh-movement for the interpretation of Chinese wh-in-situ. The island effects can be attributed to the violation of locality principles during wh-feature movement. The book is primarily tailored for researchers interested in the study of wh-questions in generative linguistics in a broad sense.
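The factorial definition of island effects mentioned above standardly quantifies an island effect as the interaction term in a 2x2 design crossing dependency length (short vs. long) with structure (non-island vs. island), often summarized as a differences-in-differences (DD) score. A minimal sketch of that computation; the condition means below are invented placeholders, not data from the book.

```python
# Sketch of the factorial (2x2) definition of island effects as a
# differences-in-differences (DD) score. Ratings are invented placeholders.
ratings = {
    ("short", "non_island"): 5.8,
    ("long", "non_island"): 5.2,
    ("short", "island"): 5.5,
    ("long", "island"): 3.9,
}

# Cost of a long dependency outside vs. inside an island structure.
length_cost_non_island = ratings[("short", "non_island")] - ratings[("long", "non_island")]
length_cost_island = ratings[("short", "island")] - ratings[("long", "island")]

# A positive DD score indicates a superadditive penalty, i.e., an island effect.
dd_score = length_cost_island - length_cost_non_island
print(f"DD score (island effect size): {dd_score:.2f}")
```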
Glossa Psycholinguistics,
Journal year: 2023, Issue 2(1). Published: April 11, 2023
Behavioral measures of word-by-word reading time provide experimental evidence to test theories of language processing. A-maze is a recent method for measuring incremental sentence processing that can localize slowdowns related to syntactic ambiguities in individual sentences. We adapted A-maze for use on longer passages and tested it on the Natural Stories corpus. Participants were able to comprehend these longer texts as they read them via the Maze task. Moreover, the task yielded useable reaction time data, with word predictability effects that were linearly related to surprisal, the same pattern found with other methods. Crucially, reaction times showed a tight relationship with the properties of the current word, with little spillover from previous words. This superior localization is an advantage of the Maze task compared with other methods. Overall, we expanded the scope of materials, and thus of theoretical questions, that can be studied with the Maze task.
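The reported linear relation between reaction time and surprisal is the kind of effect typically checked by regressing per-word reading times on model-derived surprisal alongside nuisance predictors. A minimal sketch under assumed column names (word_rt, surprisal, length, log_freq) and a plain OLS model; this is not the authors' actual analysis code, which is more elaborate.

```python
# Minimal sketch: testing a linear surprisal effect on per-word reaction times.
# The CSV file and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("maze_rts.csv")  # hypothetical file: one row per word per participant

# Reaction time as a linear function of surprisal, controlling for
# word length and log frequency.
model = smf.ols("word_rt ~ surprisal + length + log_freq", data=df).fit()
print(model.summary())
```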
In a recent manuscript entitled “Modern language models refute Chomsky’s approach to language”, Steven Piantadosi proposes that large language models such as GPT-3 can serve as serious theories of human linguistic cognition. In fact, he maintains that these models are significantly better than proposals emerging from within generative linguistics. The present note explains why this claim is wrong.
When acquiring syntax, children consistently choose hierarchical rules over competing non-hierarchical possibilities. Is this preference due to a learning bias for hierarchical structure, or to more general biases that interact with cues in children's linguistic input? We explore these possibilities by training LSTMs and Transformers, two types of neural networks without a built-in hierarchical bias, on data similar in quantity and content to children's linguistic input: text from the CHILDES corpus. We then evaluate what these models have learned about English yes/no questions, a phenomenon for which hierarchical structure is crucial. We find that, though they perform well at capturing the surface statistics of child-directed speech (as measured by perplexity), both model types generalize in a way that is more consistent with an incorrect linear rule than with the correct hierarchical rule. These results suggest that human-like generalization from this input alone requires stronger biases than the general sequence-processing biases of standard neural network architectures.
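The contrast between the linear and hierarchical rules for yes/no questions can be illustrated by comparing the probability a trained model assigns to the two candidate question forms derived from the same declarative. A minimal sketch, using a pretrained GPT-2 as a stand-in scorer and invented example sentences; the study itself trained its own LSTM and Transformer models on CHILDES rather than using GPT-2.

```python
# Sketch: does a model prefer the hierarchical-rule question over the linear-rule one?
# Example sentences are constructed for illustration.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_logprob(text):
    """Total log probability of a sentence under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    # Each token is scored by the distribution predicted at the previous position.
    token_scores = log_probs[0, :-1].gather(1, ids[0, 1:].unsqueeze(1))
    return token_scores.sum().item()

hierarchical = "Is the boy who is smiling happy?"   # move the main-clause auxiliary (correct)
linear = "Is the boy who smiling is happy?"         # move the first auxiliary (incorrect)

print("hierarchical rule:", sentence_logprob(hierarchical))
print("linear rule:", sentence_logprob(linear))
```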
Neurobiology of Language,
Journal year: 2023, Issue 5(1), pp. 167-200. Published: Sep. 7, 2023
Language models based on artificial neural networks increasingly capture key aspects of how humans process sentences. Most notably, model-based surprisals predict event-related potentials such as N400 amplitudes during parsing. Assuming that these models represent realistic estimates of human linguistic experience, their success in modeling language processing raises the possibility that the human processing system relies on no other principles than a general architecture and sufficient input. Here, we test this hypothesis on N400 effects observed in verb-final sentences in German, Basque, and Hindi. By stacking Bayesian generalised additive models, we show that, in each language, the N400 topographies in the region of the verb are best predicted when surprisal is complemented by an Agent Preference principle that transiently interprets initial role-ambiguous noun phrases as agents, leading to reanalysis when this interpretation fails. Our findings demonstrate the need for this principle independently of usage frequencies and structural differences between the languages. The principle has unequal force, however. Compared with surprisal, its effect is weakest in German, stronger in Hindi, and still stronger in Basque. This gradient is correlated with the extent to which the grammars allow unmarked NPs to be patients, a feature that boosts reanalysis effects. We conclude that language models gain more neurobiological plausibility by incorporating an Agent Preference. Conversely, theories of human sentence processing profit from incorporating surprisal in addition to principles like the Agent Preference, which arguably have distinct evolutionary roots.
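The logic of complementing surprisal with an Agent Preference predictor can be sketched as a model comparison: does adding a reanalysis indicator improve the fit to verb-region ERP amplitudes beyond surprisal alone? The sketch below uses plain OLS and invented column names as a drastically simplified stand-in; the study itself used stacked Bayesian generalised additive models over full scalp topographies.

```python
# Heavily simplified sketch of the model comparison; column names are assumed.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("verb_region_erps.csv")  # hypothetical: one row per trial
# Assumed columns:
#   n400_amp   - mean amplitude in the N400 window at the verb
#   surprisal  - language-model surprisal of the verb
#   reanalysis - 1 if an initially agent-interpreted NP turns out to be a patient

baseline = smf.ols("n400_amp ~ surprisal", data=df).fit()
extended = smf.ols("n400_amp ~ surprisal + reanalysis", data=df).fit()

# Compare fit (e.g., via AIC); the paper's comparison is fully Bayesian.
print("surprisal only AIC:", baseline.aic)
print("+ Agent Preference AIC:", extended.aic)
```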
Journal of Linguistics,
Journal year: 2024, Issue unknown, pp. 1-39. Published: Oct. 8, 2024
The English Preposing in PP construction (PiPP; e.g., Happy though/as we were) is extremely rare but displays an intricate set of stable syntactic properties. How do people become proficient with this construction despite such limited evidence? It is tempting to posit innate learning mechanisms, but present-day large language models seem to learn to represent PiPPs well, even though they employ only very general learning mechanisms and experience few instances of the construction during training. This suggests an alternative hypothesis on which knowledge of more frequent constructions helps shape knowledge of PiPPs. I seek to make this idea precise using model-theoretic syntax (MTS). In MTS, a grammar is essentially a set of constraints on forms. In this context, PiPPs can be seen as arising from a mix of construction-specific and general-purpose constraints, all inferable from linguistic experience.
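The MTS idea that a grammar is a set of constraints on forms can be made concrete by representing candidate forms as simple data structures and checking them against declarative constraints. A toy sketch; the feature encoding and the two constraints are invented for illustration and are not the paper's formalization.

```python
# Toy sketch of the model-theoretic view: a "grammar" is a set of constraints,
# and a form is well-formed iff it satisfies all of them.
from dataclasses import dataclass

@dataclass
class Form:
    fronted_phrase: str      # e.g., the preposed AP "happy"
    complementizer: str      # "though" or "as" in PiPP
    gap_in_clause: bool      # whether the clause contains a gap for the fronted phrase

def fronted_phrase_requires_gap(form: Form) -> bool:
    """General-purpose constraint: a fronted phrase must bind a gap."""
    return (not form.fronted_phrase) or form.gap_in_clause

def pipp_complementizer(form: Form) -> bool:
    """Construction-specific constraint: PiPP licenses only 'though'/'as'."""
    return form.complementizer in {"though", "as"}

GRAMMAR = [fronted_phrase_requires_gap, pipp_complementizer]

def well_formed(form: Form) -> bool:
    return all(constraint(form) for constraint in GRAMMAR)

print(well_formed(Form("happy", "though", True)))   # True:  "Happy though we were"
print(well_formed(Form("happy", "because", True)))  # False: wrong complementizer
```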