In psycholinguistics, surprisal theory posits that the amount of online processing effort expended by a human comprehender per word positively correlates with the surprisal of that word given its preceding context. In addition to this overall correlation, more importantly, the specific quantitative form taken by the effort-surprisal function offers insights into the underlying cognitive mechanisms of language processing. Focusing on English, previous studies have looked into the linearity of surprisal effects on reading times. Here, we extend the investigation by examining eyetracking corpora of seven languages: Danish, Dutch, English, German, Japanese, Mandarin, and Russian. We find evidence for superlinearity in some languages, but the results are highly sensitive to which language model is used to estimate surprisal.
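As a rough illustration of how such a linearity test can be framed (not the method of any particular study listed here), one can regress reading times on surprisal with and without a quadratic term and compare the fits. The data and the 15 ms/bit slope below are synthetic placeholders.

```python
# Synthetic sketch of a linearity check: does adding a quadratic surprisal term
# improve a reading-time regression? All numbers are invented placeholders.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
surprisal = rng.gamma(shape=2.0, scale=2.0, size=5000)    # stand-in surprisal values (bits)
rt = 200 + 15 * surprisal + rng.normal(0, 30, size=5000)  # stand-in reading times (ms)

X_linear = sm.add_constant(surprisal)
X_quadratic = sm.add_constant(np.column_stack([surprisal, surprisal ** 2]))

fit_linear = sm.OLS(rt, X_linear).fit()
fit_quadratic = sm.OLS(rt, X_quadratic).fit()

# A reliably better quadratic fit (e.g., lower AIC) would point toward a
# superlinear linking function between surprisal and reading time.
print(fit_linear.aic, fit_quadratic.aic)
```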
Trends in Cognitive Sciences, Journal year: 2023, Issue: 27(11), pp. 1032-1052. Published: Sep. 11, 2023
Prediction is often regarded as an integral aspect of incremental language comprehension, but little is known about the cognitive architectures and mechanisms that support it. We review studies showing that listeners and readers use all manner of contextual information to generate multifaceted predictions about upcoming input. The nature of these predictions may vary between individuals owing to differences in experience, among other factors. We then turn to unresolved questions which may guide the search for the underlying mechanisms. (i) Is prediction essential to language processing or an optional strategy? (ii) Are predictions generated from within the language system or by domain-general processes? (iii) What is the relationship between prediction and memory? (iv) Does prediction in comprehension require simulation via the production system? We discuss promising directions for making progress in answering these questions and for developing a mechanistic understanding of prediction in language.
Cognitive Science, Journal year: 2023, Issue: 47(11). Published: Nov. 1, 2023
Abstract
Word co-occurrence patterns in language corpora contain a surprising amount of conceptual knowledge. Large language models (LLMs), trained to predict words in context, leverage these patterns to achieve impressive performance on diverse semantic tasks requiring world knowledge. An important but understudied question about LLMs' abilities is whether they acquire generalized knowledge of common events. Here, we test whether five pretrained LLMs (from 2018's BERT to 2023's MPT) assign a higher likelihood to plausible descriptions of agent−patient interactions than to minimally different implausible versions of the same event. Using three curated sets of minimal sentence pairs (total n = 1215), we found that pretrained LLMs possess substantial event knowledge, outperforming other distributional models. In particular, they almost always assign a higher likelihood to possible versus impossible events (The teacher bought the laptop vs. The laptop bought the teacher). However, LLMs show less consistent preferences for likely versus unlikely events (The nanny tutored the boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM scores are driven by both plausibility and surface-level features, (ii) LLM scores generalize well across syntactic variants (active vs. passive constructions) but less well across semantic variants (synonymous sentences), (iii) some LLM errors mirror human judgment ambiguity, and (iv) plausibility serves as an organizing dimension in LLMs' internal representations. Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but also highlight a gap between representations of possible/impossible and likely/unlikely events.
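The core comparison described above, checking whether a model assigns higher likelihood to a plausible sentence than to its minimally different implausible counterpart, can be sketched roughly as follows. GPT-2 and the scoring helper are illustrative assumptions, not the models or code used in the study; the example pair is the one cited in the abstract.

```python
# Minimal-pair sketch: a model "prefers" the plausible event if it assigns it a
# higher total log probability than the role-reversed version.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_logprob(sentence: str) -> float:
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per predicted token; undo the mean.
    return -out.loss.item() * (ids.shape[1] - 1)

plausible = "The teacher bought the laptop."
implausible = "The laptop bought the teacher."
print(sentence_logprob(plausible) > sentence_logprob(implausible))  # expected: True
```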
Transactions of the Association for Computational Linguistics, Journal year: 2023, Issue: 11, pp. 1451-1470. Published: Jan. 1, 2023
Abstract
Surprisal theory posits that less-predictable words should take more time to process, with word predictability quantified as surprisal, i.e., negative log probability in context. While evidence supporting the predictions of surprisal theory has been replicated widely, much of it has focused on a very narrow slice of data: native English speakers reading English texts. Indeed, no comprehensive multilingual analysis exists. We address this gap in the current literature by investigating the relationship between surprisal and reading times in eleven different languages, distributed across five language families. Deriving surprisal estimates from language models trained on monolingual corpora, we test three predictions associated with surprisal theory: (i) whether surprisal is predictive of reading times, (ii) whether expected surprisal, i.e., contextual entropy, is predictive of reading times, and (iii) whether the linking function between surprisal and reading times is linear. We find that all three predictions are borne out crosslinguistically. By focusing on a more diverse set of languages, we argue that these results offer the most robust link to date between information theory and incremental language processing across languages.
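For concreteness, the quantity defined above, the negative log probability of a word given its context, can be estimated token by token with any autoregressive language model. The sketch below uses GPT-2 via Hugging Face transformers as a stand-in for the paper's monolingual models.

```python
# Sketch of the surprisal definition: surprisal(w_t) = -log p(w_t | w_<t), in bits.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def token_surprisals_bits(text: str):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)   # predictions for tokens 2..T
    targets = ids[:, 1:]
    nats = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)[0]
    bits = nats / torch.log(torch.tensor(2.0))               # convert nats to bits
    return list(zip(tok.convert_ids_to_tokens(targets[0].tolist()), bits.tolist()))

print(token_surprisals_bits("The children went outside to play."))
```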
Open Mind, Journal year: 2023, Issue: unknown, pp. 1-42. Published: June 1, 2023
Words that are more surprising given their context take longer to process. However, no incremental parsing algorithm has been shown to directly predict this phenomenon. In this work, we focus on a class of algorithms whose runtime does naturally scale in surprisal: those that involve repeatedly sampling from the prior. Our first contribution is to show that, in simple examples of such algorithms, the expected runtime increases superlinearly with surprisal, and that its variance should also increase. These two predictions stand in contrast with the literature on surprisal theory (Hale, 2001; Levy, 2008a), which assumes that expected processing cost increases linearly with surprisal and makes no prediction about variance. In the second part of the paper, we conduct an empirical study of the relationship between surprisal and reading time, using a collection of modern language models to estimate surprisal. We find that, with better language models, reading time increases superlinearly in surprisal. These results are consistent with sampling-based algorithms.
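To see intuitively why repeated sampling from the prior yields superlinear scaling, note that a naive guess-and-check sampler needs a geometrically distributed number of draws with mean 1/p = e^(surprisal in nats), so both the expected runtime and its variance blow up in surprisal. The simulation below only illustrates that arithmetic; it is not the paper's algorithms.

```python
# Illustrative simulation: number of prior draws until the observed word comes up
# is geometric with mean 1/p, i.e., exponential (hence superlinear) in surprisal.
import numpy as np

rng = np.random.default_rng(0)

def draws_until_hit(p: float) -> int:
    n = 0
    while True:
        n += 1
        if rng.random() < p:   # a draw from the prior happens to match the observed word
            return n

for surprisal_nats in [1.0, 3.0, 5.0, 7.0]:
    p = np.exp(-surprisal_nats)
    samples = [draws_until_hit(p) for _ in range(2000)]
    # mean ≈ 1/p and variance ≈ (1 - p) / p**2, both growing faster than linearly.
    print(surprisal_nats, np.mean(samples), np.var(samples))
```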
bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2023, Issue: unknown. Published: April 16, 2023
Transformer models such as GPT generate human-like language and are highly predictive of human brain responses to language. Here, using fMRI-measured brain responses to 1,000 diverse sentences, we first show that a GPT-based encoding model can predict the magnitude of the brain response associated with each sentence. Then, we use this model to identify new sentences that are predicted to drive or suppress responses in the human language network. We show that these model-selected novel sentences indeed strongly drive and suppress the activity of human language areas in new individuals. A systematic analysis reveals that surprisal and well-formedness of the linguistic input are key determinants of response strength in the language network. These results establish the ability of neural network models to not only mimic human language but also noninvasively control neural activity in higher-level cortical areas, like the language network.
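As a rough sketch of what a sentence-level encoding model of this kind involves, one maps language-model features to a measured response with regularized regression and evaluates held-out prediction accuracy. The arrays below are random placeholders, not GPT features or fMRI data from the study.

```python
# Placeholder sketch of a sentence-level encoding model: ridge regression from
# language-model features to a measured response, scored by cross-validated correlation.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 768))                  # e.g., one LM feature vector per sentence
weights = rng.normal(size=768)
response = features @ weights + rng.normal(scale=5.0, size=1000)  # fake per-sentence "brain response"

encoder = RidgeCV(alphas=np.logspace(-2, 4, 13))
predicted = cross_val_predict(encoder, features, response, cv=5)
print(np.corrcoef(predicted, response)[0, 1])            # held-out prediction accuracy of the encoder
```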
Prediction has been proposed as an overarching principle that explains human information processing in language and beyond. To what degree can the difficulty of processing syntactically complex sentences, one of the major concerns of psycholinguistics, be explained by predictability, as estimated using computational language models? A precise, quantitative test of this question requires a much larger scale data collection effort than has been done in the past. We present the Syntactic Ambiguity Processing Benchmark, a dataset of self-paced reading times from 2000 participants, who read a diverse set of English sentences. This dataset makes it possible to measure the processing difficulty associated with individual syntactic constructions, and even individual sentences, precisely enough to rigorously test the predictions of computational models of language comprehension. We find that language models with two different architectures sharply diverge from the reading time data: they dramatically underpredict processing difficulty, fail to predict the relative difficulty of different ambiguous sentences, and only partially explain item-wise variability. These findings suggest that prediction is most likely insufficient on its own to explain human syntactic processing.
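A toy version of the comparison this benchmark enables (all numbers below are invented placeholders): estimate a construction-level effect as the observed reading-time difference between ambiguous and unambiguous items, and set it against the effect implied by a fitted surprisal-to-reading-time slope.

```python
# Hypothetical comparison of observed vs. surprisal-predicted disambiguation difficulty.
import numpy as np

rt_ambiguous = np.array([520.0, 548.0, 610.0])       # mean RTs (ms) at the disambiguating word
rt_unambiguous = np.array([402.0, 415.0, 430.0])
surprisal_ambiguous = np.array([9.1, 8.7, 10.2])     # model surprisal (bits) at the same word
surprisal_unambiguous = np.array([6.3, 6.0, 6.9])

ms_per_bit = 12.0                                    # slope from a fitted linear RT ~ surprisal model
observed_effect = (rt_ambiguous - rt_unambiguous).mean()
predicted_effect = ms_per_bit * (surprisal_ambiguous - surprisal_unambiguous).mean()
print(observed_effect, predicted_effect)             # underprediction appears as predicted << observed
```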
Neural language models are increasingly valued in computational psycholinguistics, due to their ability to provide conditional probability distributions over the lexicon that are predictive of human processing times. Given the vast array of available models, it is of both theoretical and methodological importance to assess what features of a model influence its psychometric quality. In this work we focus on parameter size, showing that larger Transformer-based models generate probabilistic estimates that are less predictive of early eye-tracking measurements reflecting lexical access and semantic integration. However, relatively bigger models show an advantage in capturing late measurements that reflect the full syntactic integration of a word into the current context. Our results are supported by eye movement data in ten languages and consider four model sizes, spanning from 564M to 4.5B parameters.
Transactions of the Association for Computational Linguistics, Journal year: 2023, Issue: 11, pp. 1624-1642. Published: Jan. 1, 2023
Abstract
Over the past two decades, numerous studies have demonstrated how less-predictable (i.e., higher surprisal) words take more time to read. In general, these studies have implicitly assumed that the reading process is purely responsive: readers observe a new word and allocate time to process it as required. We argue that prior results are also compatible with a reading process that is at least partially anticipatory: readers could make predictions about an upcoming word and allocate time to process it based on their expectation. In this work, we operationalize this anticipation as a word's contextual entropy. We assess the effect of anticipation by comparing how well surprisal and contextual entropy predict reading times on four naturalistic datasets: two self-paced and two eye-tracking. Experimentally, across datasets and analyses, we find substantial evidence for effects of contextual entropy over and above surprisal on a word's reading time (RT): in fact, entropy is sometimes better than surprisal in predicting a word's RT. Spillover effects, however, are generally not captured by entropy, but only by surprisal. Further, we hypothesize a number of cognitive mechanisms through which contextual entropy could impact RTs, three of which we are able to design experiments to analyze. Overall, our results support the view that the reading process is not just responsive, but also anticipatory.
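To make the contrast between the two predictors concrete: surprisal depends on the word that actually appears, whereas contextual entropy is the expectation of surprisal over the model's whole next-word distribution, so it is available before the word is seen. The sketch below uses GPT-2 and an example context as assumptions for illustration; the paper uses its own models and datasets.

```python
# Contextual entropy (anticipation, before the word) vs. surprisal (response, after).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

context = "After dinner, the children went outside to"
ids = tok(context, return_tensors="pt").input_ids
with torch.no_grad():
    log_p = torch.log_softmax(model(ids).logits[0, -1], dim=-1)  # next-token distribution

entropy = -(log_p.exp() * log_p).sum()   # H(W_t | context), in nats: the anticipation term
next_id = tok(" play").input_ids[0]      # the word that actually appears
surprisal = -log_p[next_id]              # -log p("play" | context): the responsive term
print(float(entropy), float(surprisal))
```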