Meaning Beyond Lexicality: Capturing Pseudoword Definitions with Language Models
Andrea Gregor de Varda, Daniele Gatti, Marco Marelli

et al.

Published: June 3, 2024

Pseudowords such as “knackets” or “spechy” – letter strings that are consistent with the orthotactical rules of a language but do not appear in its lexicon – are traditionally considered to be meaningless, and are employed as such in empirical studies. However, recent studies that show specific semantic patterns associated with these words, as well as semantic effects on human pseudoword processing, have cast doubt on this view. While such findings suggest that pseudowords have meanings, they provide only extremely limited insight into whether humans are able to ascribe explicit, declarative semantic content to unfamiliar word forms. In the present study, we adopted an exploratory-confirmatory study design to examine this question. The first, exploratory study started from a pre-existing dataset of pseudowords alongside human-generated definitions for those items. Employing 18 different language models, we showed that the definitions actually produced for those (pseudo)words were closer to their respective (pseudo)words than to other items. Based on these initial results, we conducted a second, pre-registered, high-powered confirmatory study, collecting a new, controlled set of (pseudo)word interpretations. This second study confirmed the results of the first one. Taken together, these findings support the idea that meaning construction is supported by a flexible form-to-meaning mapping system based on statistical regularities in the language environment that can accommodate novel lexical entries as soon as they are encountered.

Language: English

Driving and suppressing the human language network using large language models
Greta Tuckute, Aalok Sathe, Shashank Srikant

et al.

Nature Human Behaviour, 2024, 8(3), pp. 544–561

Published: Jan. 3, 2024

Language: English

Citations: 33

Prediction during language comprehension: what is next?
Rachel Ryskin, Mante S. Nieuwland

Trends in Cognitive Sciences, 2023, 27(11), pp. 1032–1052

Published: Sept. 11, 2023

Prediction is often regarded as an integral aspect of incremental language comprehension, but little is known about the cognitive architectures and mechanisms that support it. We review studies showing that listeners and readers use all manner of contextual information to generate multifaceted predictions about upcoming input. The nature of these predictions may vary between individuals, owing to differences in linguistic experience, among other factors. We then turn to unresolved questions which may guide the search for the underlying mechanisms. (i) Is prediction essential to language processing or an optional strategy? (ii) Are predictions generated from within the language system or by domain-general processes? (iii) What is the relationship between prediction and memory? (iv) Does prediction in comprehension require simulation via the production system? We discuss promising directions for making progress in answering these questions and for developing a mechanistic understanding of prediction in language.

Language: English

Citations: 41

Event Knowledge in Large Language Models: The Gap Between the Impossible and the Unlikely
Carina Kauf, Anna A. Ivanova, Giulia Rambelli

et al.

Cognitive Science, 2023, 47(11)

Published: Nov. 1, 2023

Word co‐occurrence patterns in language corpora contain a surprising amount of conceptual knowledge. Large language models (LLMs), trained to predict words in context, leverage these patterns to achieve impressive performance on diverse semantic tasks requiring world knowledge. An important but understudied question about LLMs' semantic abilities is whether they acquire generalized knowledge of common events. Here, we test whether five pretrained LLMs (from 2018's BERT to 2023's MPT) assign a higher likelihood to plausible descriptions of agent−patient interactions than to minimally different implausible versions of the same event. Using three curated sets of minimal sentence pairs (total n = 1215), we found that pretrained LLMs possess substantial event knowledge, outperforming other distributional language models. In particular, they almost always assign a higher likelihood to possible versus impossible events (The teacher bought the laptop vs. The laptop bought the teacher). However, the LLMs show less consistent preferences for likely versus unlikely events (The nanny tutored the boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM scores are driven by both plausibility and surface‐level sentence features, (ii) LLM scores generalize well across syntactic variants (active vs. passive constructions) but less well across semantic variants (synonymous sentences), (iii) some LLM errors mirror human judgment ambiguity, and (iv) sentence plausibility serves as an organizing dimension in internal LLM representations. Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but also highlight a gap between representations of possible/impossible and likely/unlikely events.
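The minimal-pair evaluation described in this abstract reduces to a simple ranking check: a model is credited with event knowledge on a pair when it scores the plausible sentence above its implausible counterpart. The sketch below illustrates that logic with hypothetical hand-picked scores standing in for LLM sentence log-probabilities (the study itself used pretrained LLMs, not this toy scorer):

```python
# Minimal-pair evaluation: count a (plausible, implausible) pair as correct
# when the plausible sentence receives the higher score.
def pairwise_accuracy(pairs, score):
    """Fraction of (plausible, implausible) pairs ranked correctly."""
    correct = sum(score(p) > score(i) for p, i in pairs)
    return correct / len(pairs)

# Toy scores standing in for an LLM's sentence log-probabilities (assumed values).
toy_logprob = {
    "The teacher bought the laptop.": -20.0,
    "The laptop bought the teacher.": -35.0,  # impossible: roles swapped
    "The nanny tutored the boy.": -22.0,
    "The boy tutored the nanny.": -23.0,      # unlikely, but not impossible
}

pairs = [
    ("The teacher bought the laptop.", "The laptop bought the teacher."),
    ("The nanny tutored the boy.", "The boy tutored the nanny."),
]
print(pairwise_accuracy(pairs, toy_logprob.__getitem__))  # 1.0 on this toy set
```

The paper's central finding maps onto the score margins above: the possible/impossible gap is large and reliably detected, while the likely/unlikely gap is small and models rank such pairs less consistently.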

Language: English

Citations: 34

Testing the Predictions of Surprisal Theory in 11 Languages

Ethan G. Wilcox, Tiago Pimentel, Clara Meister

et al.

Transactions of the Association for Computational Linguistics, 2023, 11, pp. 1451–1470

Published: Jan. 1, 2023

Surprisal theory posits that less-predictable words should take more time to process, with word predictability quantified as surprisal, i.e., negative log probability in context. While evidence supporting the predictions of surprisal theory has been replicated widely, much of it has focused on a very narrow slice of data: native English speakers reading English texts. Indeed, no comprehensive multilingual analysis exists. We address this gap in the current literature by investigating the relationship between surprisal and reading times in eleven different languages, distributed across five language families. Deriving surprisal estimates from language models trained on monolingual corpora, we test three predictions associated with surprisal theory: (i) whether surprisal is predictive of reading times, (ii) whether expected surprisal, i.e., contextual entropy, is predictive of reading times, and (iii) whether the linking function between surprisal and reading times is linear. We find that all three predictions are borne out crosslinguistically. By focusing on a more diverse set of languages, we argue that these results offer the most robust link to date between information theory and incremental language processing across languages.
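The two quantities tested in this abstract have simple definitions: surprisal is the negative log probability of a word in context, and contextual entropy is the expected surprisal over the next-word distribution. A minimal sketch (not code from the paper, using a made-up toy distribution):

```python
import math

def surprisal(p_word: float) -> float:
    """Surprisal in bits: -log2 of the word's in-context probability."""
    return -math.log2(p_word)

def contextual_entropy(dist: dict[str, float]) -> float:
    """Expected surprisal (entropy) over a next-word distribution."""
    return sum(p * surprisal(p) for p in dist.values() if p > 0)

# Toy next-word distribution after some context (assumed probabilities).
dist = {"mat": 0.5, "sofa": 0.25, "moon": 0.25}
print(surprisal(dist["mat"]))       # 1.0 bit: highly predictable word
print(surprisal(dist["moon"]))      # 2.0 bits: less predictable word
print(contextual_entropy(dist))     # 1.5 bits of expected surprisal
```

Prediction (iii) in the abstract then asks whether reading time is well modeled as a linear function of the first quantity.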

Language: English

Citations: 29

Large-scale benchmark yields no evidence that language model surprisal explains syntactic disambiguation difficulty
Kuan‐Jung Huang, Suhas Arehalli, Mari Kugemoto

et al.

Journal of Memory and Language, 2024, 137, 104510

Published: Feb. 28, 2024

Language: English

Citations: 9

Driving and suppressing the human language network using large language models
Greta Tuckute, Aalok Sathe, Shashank Srikant

et al.

bioRxiv (Cold Spring Harbor Laboratory), 2023

Published: April 16, 2023

Transformer models such as GPT generate human-like language and are highly predictive of human brain responses to language. Here, using fMRI-measured brain responses to 1,000 diverse sentences, we first show that a GPT-based encoding model can predict the magnitude of the brain response associated with each sentence. Then, we use the model to identify new sentences that are predicted to drive or suppress responses in the human language network. We show that these model-selected novel sentences indeed strongly drive and suppress the activity of human language areas in new individuals. A systematic analysis of the model-selected sentences reveals that surprisal and well-formedness of the linguistic input are key determinants of response strength in the language network. These results establish the ability of neural network models to not only mimic human language but also noninvasively control neural activity in higher-level cortical areas, like the language network.

Language: English

Citations: 14

The Plausibility of Sampling as an Algorithmic Theory of Sentence Processing
Jacob Louis Hoover, Morgan Sonderegger, Steven T. Piantadosi

et al.

Open Mind, 2023, pp. 1–42

Published: June 1, 2023

Words that are more surprising given their context take longer to process. However, no incremental parsing algorithm has been shown to directly predict this phenomenon. In this work, we focus on a class of algorithms whose runtime does naturally scale in surprisal: those that involve repeatedly sampling from the prior. Our first contribution is to show that simple examples of such algorithms predict runtime to increase superlinearly with surprisal, and also predict variance in runtime to increase. These two predictions stand in contrast with the literature on surprisal theory (Hale, 2001; Levy, 2008a), which assumes that expected processing cost increases linearly with surprisal and makes no prediction about variance. In the second part of the paper, we conduct an empirical study of the relationship between surprisal and reading time, using a collection of modern language models to estimate surprisal. We find that with better language models, reading time increases superlinearly with surprisal, and that its variance increases as well. These results are consistent with sampling-based algorithms.
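One simple member of the class of algorithms discussed in this abstract is a guess-and-check sampler (a sketch of the general idea, not the authors' implementation): repeatedly sample from the prior until the observed word is drawn. The number of attempts is geometric with mean 1/p, which equals 2^surprisal in bits, so expected runtime grows superlinearly (indeed exponentially) in surprisal, and so does its variance:

```python
import math
import random

def trials_until_match(p: float, rng: random.Random) -> int:
    """Sample from the prior until the observed word (probability p) is drawn.
    The trial count is geometrically distributed with mean 1/p."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n

rng = random.Random(0)
for p in (0.5, 0.1, 0.01):
    runs = [trials_until_match(p, rng) for _ in range(20000)]
    mean = sum(runs) / len(runs)
    s = -math.log2(p)  # surprisal in bits
    # Expected runtime is 1/p = 2**s: exponential, hence superlinear, in surprisal.
    print(f"surprisal={s:.2f} bits, mean trials={mean:.1f}, theory={1 / p:.1f}")
```

The geometric distribution's variance, (1 - p) / p^2, grows even faster than its mean, matching the abstract's second prediction that runtime variance should also increase with surprisal.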

Language: English

Citations: 13

Surprisal does not explain syntactic disambiguation difficulty: evidence from a large-scale benchmark
Kuan‐Jung Huang, Suhas Arehalli, Mari Kugemoto

et al.

Published: April 21, 2023

Prediction has been proposed as an overarching principle that explains human information processing in language and beyond. To what degree can the difficulty of syntactically complex sentences, one of the major concerns of psycholinguistics, be explained by predictability, as estimated using computational language models? A precise, quantitative test of this question requires a much larger scale data collection effort than has been done in the past. We present the Syntactic Ambiguity Processing Benchmark, a dataset of self-paced reading times from 2000 participants, who read a diverse set of complex English sentences. This dataset makes it possible to measure the processing difficulty associated with individual syntactic constructions, and even individual sentences, precisely enough to rigorously test the predictions of computational models of language comprehension. We find that the predictions of two different language model architectures sharply diverge from the human reading time data, dramatically underpredicting processing difficulty, failing to predict the relative difficulty among ambiguous sentences, and only partially explaining item-wise variability. These findings suggest that prediction is most likely insufficient on its own to explain human syntactic processing.

Language: English

Citations: 12

Scaling in Cognitive Modelling: a Multilingual Approach to Human Reading Times
Andrea Gregor de Varda, Marco Marelli

Published: Jan. 1, 2023

Neural language models are increasingly valued in computational psycholinguistics, due to their ability to provide conditional probability distributions over the lexicon that are predictive of human processing times. Given the vast array of available models, it is of both theoretical and methodological importance to assess what features of a model influence its psychometric quality. In this work we focus on parameter size, showing that larger Transformer-based models generate probabilistic estimates that are less predictive of early eye-tracking measurements reflecting lexical access and early semantic integration. However, relatively bigger models show an advantage in capturing late eye-tracking measurements that reflect the full semantic and syntactic integration of a word into the current linguistic context. Our results are supported by eye movement data in ten languages and consider four models, spanning from 564M to 4.5B parameters.

Language: English

Citations: 12

Lexical Processing Strongly Affects Reading Times But Not Skipping During Natural Reading
Micha Heilbron, Jorie van Haren, Peter Hagoort

et al.

Open Mind, 2023, 7, pp. 757–783

Published: Jan. 1, 2023

In a typical text, readers look much longer at some words than at others, even skipping many altogether. Historically, researchers explained this variation via low-level visual or oculomotor factors, but today it is primarily explained via factors determining a word's lexical processing ease, such as how well word identity can be predicted from context or discerned from parafoveal preview. While the existence of these effects has been established in controlled experiments, the relative importance of prediction, preview and lexical complexity in natural reading remains unclear. Here, we address this question in three large naturalistic reading corpora (n = 104, 1.5 million words), using deep neural networks and Bayesian ideal observers to model linguistic prediction and parafoveal preview from moment to moment in natural reading. Strikingly, neither prediction nor preview was important for explaining word skipping: the vast majority of skipping was explained by a simple model, based on just fixation position and word length. For reading times, by contrast, we found strong independent contributions of prediction and preview, with effect sizes matching those of controlled experiments. Together, these results challenge dominant models of eye movements in reading, and instead support alternative models that describe skipping (but not reading times) as largely autonomous from word identification, and mostly determined by low-level visual information.

Language: English

Citations: 10