What language models can tell us about learning adjectives
Megan Gotowski, Forrest Davis

Proceedings of the Linguistic Society of America, Journal year: 2024, Issue 9(1), pp. 5693–5693

Published: May 15, 2024

It has been argued that language models (LMs) can inform our knowledge of language acquisition. While LMs are claimed to replicate aspects of grammatical knowledge, it remains unclear how this translates to acquisition directly. We ask if a model trained specifically on child-directed speech (CDS) is able to capture adjectives. Ultimately, the results reveal that what the model is “learning” is how adjectives are distributed in CDS, and not the properties of different adjective classes. While highlighting the ability of LMs to learn distributional information, these findings suggest that distributional information alone cannot explain how children generalize beyond their input.

Language: English

A hierarchy of linguistic predictions during natural language comprehension
Micha Heilbron, Kristijan Armeni, Jan‐Mathijs Schoffelen

et al.

Proceedings of the National Academy of Sciences, Journal year: 2022, Issue 119(32)

Published: Aug. 3, 2022

Understanding spoken language requires transforming ambiguous acoustic streams into a hierarchy of representations, from phonemes to meaning. It has been suggested that the brain uses prediction to guide the interpretation of incoming input. However, the role of prediction in language processing remains disputed, with disagreement about both the ubiquity and representational nature of predictions. Here, we address both issues by analyzing brain recordings of participants listening to audiobooks, using a deep neural network (GPT-2) to precisely quantify contextual predictions. First, we establish that brain responses to words are modulated by ubiquitous predictions. Next, we disentangle model-based predictions into distinct dimensions, revealing dissociable signatures of predictions about syntactic category (parts of speech), phonemes, and semantics. Finally, we show that high-level (word) predictions inform low-level (phoneme) predictions, supporting hierarchical predictive processing. Together, these results underscore the ubiquity of prediction in language processing, showing that the brain spontaneously predicts upcoming language at multiple levels of abstraction.
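
For readers curious what quantifying contextual predictions with GPT-2 looks like in practice, the following is a minimal sketch of per-word surprisal estimation with the Hugging Face transformers library; it illustrates the general technique only and is not the authors' analysis pipeline.

```python
# A minimal sketch (not the authors' pipeline): per-token surprisal from GPT-2
# using the Hugging Face transformers library.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def token_surprisals(text):
    """Return (token, surprisal in bits) for every token after the first."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits              # shape: [1, seq_len, vocab]
    log_probs = torch.log_softmax(logits, dim=-1)
    results = []
    for i in range(1, ids.size(1)):
        lp = log_probs[0, i - 1, ids[0, i]].item()   # log P(token_i | preceding context)
        results.append((tokenizer.decode(ids[0, i]), -lp / math.log(2)))
    return results

print(token_surprisals("The children listened to the audiobook."))
```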

Language: English

Cited by

232

Dissociating language and thought in large language models
Kyle Mahowald, Anna A. Ivanova, Idan Blank

et al.

Trends in Cognitive Sciences, Journal year: 2024, Issue 28(6), pp. 517–540

Published: March 19, 2024

Language: English

Cited by

125

Large Language Models Demonstrate the Potential of Statistical Learning in Language
Pablo Contreras Kallens, Ross Deans Kristensen‐McLachlan, Morten H. Christiansen

et al.

Cognitive Science, Journal year: 2023, Issue 47(3)

Published: Feb. 25, 2023

To what degree can language be acquired from linguistic input alone? This question has vexed scholars for millennia and is still a major focus of debate in the cognitive science of language. The complexity of human language has hampered progress because studies of language–especially those involving computational modeling–have only been able to deal with small fragments of our linguistic skills. We suggest that the most recent generation of Large Language Models (LLMs) might finally provide the tools to determine empirically how much of the human language ability can be acquired from linguistic experience. LLMs are sophisticated deep learning architectures trained on vast amounts of natural language data, enabling them to perform an impressive range of linguistic tasks. We argue that, despite their clear semantic and pragmatic limitations, LLMs have already demonstrated that human‐like grammatical language can be acquired without the need for a built‐in grammar. Thus, while there is still much to learn about how humans acquire and use language, these full‐fledged computational models allow cognitive scientists to evaluate just how far statistical learning might take us in explaining the full complexity of human language.

Language: English

Cited by

73

Language Model Behavior: A Comprehensive Survey

Tyler A. Chang, Benjamin Bergen

Computational Linguistics, Journal year: 2023, Issue 50(1), pp. 293–350

Published: Nov. 15, 2023

Transformer language models have received widespread public attention, yet their generated text is often surprising even to NLP researchers. In this survey, we discuss over 250 recent studies of English language model behavior before task-specific fine-tuning. Language models possess basic capabilities in syntax, semantics, pragmatics, world knowledge, and reasoning, but these capabilities are sensitive to specific inputs and surface features. Despite dramatic increases in generated text quality as models scale to hundreds of billions of parameters, the models are still prone to unfactual responses, commonsense errors, memorized text, and social biases. Many of these weaknesses can be framed as over-generalizations or under-generalizations of learned patterns in text. We synthesize recent results to highlight what is currently known about large language model capabilities, thus providing a resource for applied work and for research in adjacent fields that use language models.

Language: English

Cited by

41

Wh-island Effects in Chinese
Chen Xu

Linguistik aktuell, Journal year: 2024, Issue unknown

Published: Jan. 12, 2024

This book examines three controversial generalizations concerning wh-island effects in Chinese: the argument and adjunct asymmetry, the subject and object asymmetry, and the D-linked and non-D-linked asymmetry. Experiments under the factorial definition of island effects reveal that: (1) both argument and adjunct wh-in-situ are sensitive to the wh-island, displaying no asymmetry; (2) subject and object wh-in-situ differ in effect size: one manifests a larger magnitude of wh-island effects, whereas the other shows a smaller effect size due to a confounding double-name penalty, exhibiting a special pattern; (3) who-in-situ evinces wh-island effects, while what-in-situ demonstrates only marginal effects. The findings support the theory of covert wh-movement for the interpretation of Chinese wh-in-situ. The effects can be attributed to the violation of locality principles during wh-feature movement. The book is primarily tailored for researchers interested in the study of wh-questions and generative linguistics in a broad sense.
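
As background on the "factorial definition of island effects" mentioned above: an island effect is operationalized as the interaction in a 2x2 design crossing structure (non-island vs. island) with dependency length (short vs. long), i.e. a difference-in-differences (DD) score. The sketch below shows only that arithmetic, with made-up mean judgment scores; it is not the book's data or analysis.

```python
# A minimal sketch of the factorial (2x2) definition of island effects: the
# island effect is the interaction of structure (non-island vs. island) with
# dependency length (short vs. long), i.e. a difference-in-differences (DD)
# score. The mean (z-scored) acceptability values below are hypothetical.
means = {
    ("non_island", "short"): 0.9,
    ("non_island", "long"): 0.6,
    ("island", "short"): 0.8,
    ("island", "long"): -0.4,
}

# Cost of lengthening the dependency outside vs. inside an island configuration.
length_cost_non_island = means[("non_island", "short")] - means[("non_island", "long")]
length_cost_island = means[("island", "short")] - means[("island", "long")]

# A DD score well above zero indicates a superadditive (island) effect.
dd_score = length_cost_island - length_cost_non_island
print(f"DD score: {dd_score:.2f}")   # 0.90 with the hypothetical means above
```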

Language: English

Cited by

12

A-maze of Natural Stories: Comprehension and surprisal in the Maze task
Veronica Boyce, Roger Lévy

Glossa Psycholinguistics, Journal year: 2023, Issue 2(1)

Published: April 11, 2023

Behavioral measures of word-by-word reading time provide experimental evidence to test theories of language processing. A-maze is a recent method for measuring incremental sentence processing that can localize slowdowns related to syntactic ambiguities in individual sentences. We adapted A-maze for use on longer passages and tested it on the Natural Stories corpus. Participants were able to comprehend these longer passages of text that they read via the Maze task. Moreover, the task yielded useable reaction time data, with word predictability effects that were linear in surprisal, the same pattern found with other methods. Crucially, reaction times show a tight relationship with properties of the current word, with little spillover of effects from previous words. This superior localization is an advantage compared with other measurement methods. Overall, we expanded the scope of materials, and thus theoretical questions, that can be studied with the Maze task.
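
To illustrate the kind of "linear in surprisal" predictability effect reported here, below is a minimal sketch of an ordinary least squares regression of reaction times on surprisal and nuisance predictors using statsmodels; the data are simulated placeholders, not the authors' Natural Stories Maze data, and the mixed-effects structure of a real analysis is omitted.

```python
# A minimal sketch: testing for a linear effect of surprisal on word-by-word
# reaction times with an OLS regression (statsmodels). The data are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "surprisal": rng.gamma(shape=2.0, scale=3.0, size=n),   # bits, simulated
    "word_length": rng.integers(1, 12, size=n),
    "log_freq": rng.normal(-4.0, 1.5, size=n),
})
# Simulated RTs: a linear surprisal effect plus nuisance predictors and noise.
df["rt"] = (700 + 25 * df["surprisal"] + 10 * df["word_length"]
            - 15 * df["log_freq"] + rng.normal(0, 80, size=n))

model = smf.ols("rt ~ surprisal + word_length + log_freq", data=df).fit()
print(model.summary().tables[1])   # surprisal slope, interpretable as ms per bit
```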

Language: English

Cited by

23

Why large language models are poor theories of human linguistic cognition: A reply to Piantadosi
Roni Katzir

Biolinguistics, Journal year: 2023, Issue 17

Published: Dec. 15, 2023

In a recent manuscript entitled “Modern language models refute Chomsky’s approach to language”, Steven Piantadosi proposes that large language models such as GPT-3 can serve as serious theories of human linguistic cognition. In fact, he maintains that these models are significantly better theories than proposals emerging from within generative linguistics. The present note explains why this claim is wrong.

Language: English

Cited by

14

How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech

Aditya Yedetore, Tal Linzen, Robert Frank

et al.

Published: Jan. 1, 2023

When acquiring syntax, children consistently choose hierarchical rules over competing non-hierarchical possibilities. Is this preference due to a learning bias for hierarchical structure, or due to more general biases that interact with cues in children's linguistic input? We explore these possibilities by training LSTMs and Transformers - two types of neural networks without a hierarchical bias - on data similar in quantity and content to children's linguistic input: text from the CHILDES corpus. We then evaluate what these models have learned about English yes/no questions, a phenomenon for which hierarchical structure is crucial. We find that, though they perform well at capturing the surface statistics of child-directed speech (as measured by perplexity), both model types generalize in a way more consistent with an incorrect linear rule than with the correct hierarchical rule. These results suggest that human-like generalization from text alone requires stronger biases than the general sequence-processing biases of standard neural network architectures.
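
A minimal sketch of one way to probe the linear-vs.-hierarchical contrast with a trained language model: score a yes/no question formed by the hierarchical rule against one formed by the linear rule and compare total log probabilities. GPT-2 is used here purely as a stand-in; the paper instead evaluates LSTMs and Transformers trained on CHILDES text, with its own evaluation items and metrics.

```python
# A minimal sketch: comparing a hierarchical-rule question with a linear-rule
# question by total model log probability. GPT-2 is a stand-in model here.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence):
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    # Sum log P(token_i | preceding tokens) over the sentence.
    return sum(log_probs[0, i - 1, ids[0, i]].item() for i in range(1, ids.size(1)))

# Declarative source: "The boy who is smiling is happy."
hierarchical = "Is the boy who is smiling happy?"   # fronts the main-clause auxiliary
linear = "Is the boy who smiling is happy?"         # fronts the linearly first auxiliary

print("prefers hierarchical:", sentence_logprob(hierarchical) > sentence_logprob(linear))
```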

Language: English

Cited by

11

Surprisal From Language Models Can Predict ERPs in Processing Predicate-Argument Structures Only if Enriched by an Agent Preference Principle
Eva Huber, Sebastian Sauppe, Arrate Isasi-Isasmendi

et al.

Neurobiology of Language, Journal year: 2023, Issue 5(1), pp. 167–200

Published: Sep. 7, 2023

Language models based on artificial neural networks increasingly capture key aspects of how humans process sentences. Most notably, model-based surprisals predict event-related potentials such as N400 amplitudes during parsing. Assuming that these models represent realistic estimates of human linguistic experience, their success in modeling language processing raises the possibility that the human language system relies on no other principles than general architecture and sufficient input. Here, we test this hypothesis on ERP effects observed in verb-final sentences in German, Basque, and Hindi. By stacking Bayesian generalised additive models, we show that, in each language, the ERP topographies in the region of the verb are best predicted when surprisal is complemented by an Agent Preference principle that transiently interprets initial role-ambiguous noun phrases as agents, leading to reanalysis when this interpretation fails. Our findings demonstrate the need for this principle independently of usage frequencies and structural differences between languages. The principle has unequal force, however. Compared to surprisal, its effect is weakest in German, stronger in Hindi, and still stronger in Basque. This gradient is correlated with the extent to which the grammars of these languages allow unmarked NPs to be patients, a feature that boosts reanalysis effects. We conclude that language models gain more neurobiological plausibility by incorporating an Agent Preference. Conversely, theories of language processing profit from incorporating surprisal in addition to principles like the Agent Preference, which arguably have distinct evolutionary roots.
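
The underlying model-comparison logic, whether a predictor encoding the Agent Preference improves prediction of ERP amplitudes beyond surprisal alone, can be sketched with ordinary least squares and simulated data; the actual study stacks Bayesian generalised additive models over EEG topographies, which this toy example does not attempt to reproduce.

```python
# A minimal sketch of the comparison logic (not the paper's Bayesian GAM
# stacking): does adding an Agent Preference predictor improve prediction of
# N400-like amplitudes beyond surprisal alone? All data here are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 400
df = pd.DataFrame({
    "surprisal": rng.gamma(shape=2.0, scale=3.0, size=n),
    # 1 = initial role-ambiguous NP later turns out to be a patient (reanalysis)
    "agent_pref_violation": rng.integers(0, 2, size=n),
})
df["n400"] = (-0.3 * df["surprisal"] - 1.2 * df["agent_pref_violation"]
              + rng.normal(0, 1.0, size=n))

m_surprisal = smf.ols("n400 ~ surprisal", data=df).fit()
m_full = smf.ols("n400 ~ surprisal + agent_pref_violation", data=df).fit()
# Lower AIC indicates the better-fitting model.
print(f"AIC surprisal only: {m_surprisal.aic:.1f}  with Agent Preference: {m_full.aic:.1f}")
```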

Language: English

Cited by

10

Characterizing English Preposing in PP constructions
Christopher Potts

Journal of Linguistics, Journal year: 2024, Issue unknown, pp. 1–39

Published: Oct. 8, 2024

The English Preposing in PP construction (PiPP; e.g., Happy though/as we were) is extremely rare but displays an intricate set of stable syntactic properties. How do people become proficient with this construction despite such limited evidence? It is tempting to posit innate learning mechanisms, but present-day large language models seem to learn to represent PiPPs well, even though they employ only very general learning mechanisms and experience few instances of the construction during training. This suggests an alternative hypothesis on which knowledge of more frequent constructions helps shape knowledge of PiPPs. I seek to make this idea precise using model-theoretic syntax (MTS). In MTS, a grammar is essentially a set of constraints on forms. In this context, PiPPs can be seen as arising from a mix of construction-specific and general-purpose constraints, all inferable from linguistic experience.

Language: English

Cited by

3