Nature Communications, Journal Year: 2024, Volume and Issue: 15(1), Published: June 29, 2024
Abstract
When processing language, the brain is thought to deploy specialized computations to construct meaning from complex linguistic structures. Recently, artificial neural networks based on the Transformer architecture have revolutionized the field of natural language processing. Transformers integrate contextual information across words via structured circuit computations. Prior work has focused on the internal representations (“embeddings”) generated by these circuits. In this paper, we instead analyze the circuit computations directly: we deconstruct them into functionally-specialized “transformations” that integrate contextual information across words. Using functional MRI data acquired while participants listened to naturalistic stories, we first verify that these transformations account for considerable variance in brain activity across the cortical language network. We then demonstrate that the emergent computations performed by individual, functionally-specialized “attention heads” differentially predict activity in specific cortical regions. These heads fall along gradients corresponding to different layers and context lengths in a low-dimensional space.
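As a rough illustration of the per-head “transformations” described above, the sketch below (Python, Hugging Face transformers, with GPT-2 as a stand-in model) extracts each attention head's attention-weighted sum of value vectors at every word position. The hook-based extraction and the example sentence are assumptions about one reasonable way to obtain these quantities; the paper's actual feature-extraction and fMRI encoding pipeline is more involved.

```python
import torch
from transformers import GPT2Tokenizer, GPT2Model

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

# Capture the concatenated query/key/value projections of every layer.
qkv = {}
def make_hook(layer):
    def hook(module, inputs, output):
        qkv[layer] = output.detach()            # (batch, seq, 3 * hidden)
    return hook

for i, block in enumerate(model.h):
    block.attn.c_attn.register_forward_hook(make_hook(i))

text = "We listened to naturalistic stories in the scanner"  # illustrative input
ids = tok(text, return_tensors="pt")

with torch.no_grad():
    out = model(**ids, output_attentions=True)

n_heads = model.config.n_head
head_dim = model.config.hidden_size // n_heads

# Per-head "transformations": attention-weighted sums of value vectors,
# taken before the output projection mixes heads back into the residual stream.
transformations = []   # one (seq_len, n_heads * head_dim) matrix per layer
for layer, attn in enumerate(out.attentions):   # attn: (1, heads, seq, seq)
    _, _, value = qkv[layer].split(model.config.hidden_size, dim=2)
    value = value.reshape(1, -1, n_heads, head_dim).transpose(1, 2)  # (1, heads, seq, d)
    per_head = attn @ value                                          # (1, heads, seq, d)
    transformations.append(per_head.transpose(1, 2).reshape(-1, n_heads * head_dim))
```

In an encoding analysis, each layer's (or each head's) matrix would then be regressed onto the fMRI time series, as sketched for a later study below.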
Nature Neuroscience, Journal Year: 2022, Volume and Issue: 25(3), P. 369 - 380, Published: March 1, 2022
Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language models (DLMs). Using a self-supervised next-word prediction task, these models generate appropriate responses to a given context. In the current study, nine participants listened to a 30-min podcast while their brain responses were recorded using electrocorticography (ECoG). We provide empirical evidence that the human brain and autoregressive DLMs share three fundamental computational principles as they process the same natural narrative: (1) both are engaged in continuous next-word prediction before word onset; (2) both match their pre-onset predictions to the incoming word to calculate post-onset surprise; (3) both rely on contextual embeddings to represent words in natural contexts. Together, our findings suggest a biologically feasible computational framework for studying the neural basis of language.
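The first two shared principles can be made concrete with a small sketch: given an autoregressive model such as GPT-2 (used here as a stand-in; the example sentence is illustrative), the distribution over token t computed from the tokens before t is the pre-onset prediction, and the negative log-probability of the word that actually arrives is the post-onset surprise.

```python
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "the quick brown fox jumps over the lazy dog"
ids = tok(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits                  # (1, seq, vocab)
log_probs = F.log_softmax(logits, dim=-1)

# Pre-onset prediction: distribution over token t given tokens < t.
# Post-onset surprise: negative log-probability of the token that actually arrived.
for t in range(1, ids.shape[1]):
    actual = ids[0, t].item()
    surprise = -log_probs[0, t - 1, actual].item()
    predicted = tok.decode(log_probs[0, t - 1].argmax().item())
    print(f"{tok.decode(actual)!r}: top prediction {predicted!r}, surprisal {surprise:.2f} nats")
```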
Proceedings of the National Academy of Sciences, Journal Year: 2022, Volume and Issue: 119(32), Published: Aug. 3, 2022
Understanding spoken language requires transforming ambiguous acoustic streams into a hierarchy of representations, from phonemes to meaning. It has been suggested that the brain uses prediction to guide the interpretation of incoming input. However, the role of prediction in language processing remains disputed, with disagreement about both the ubiquity and representational nature of these predictions. Here, we address both issues by analyzing brain recordings of participants listening to audiobooks, using a deep neural network (GPT-2) to precisely quantify contextual predictions. First, we establish that brain responses to words are modulated by ubiquitous predictions. Next, we disentangle model-based predictions into distinct dimensions, revealing dissociable neural signatures of predictions about syntactic category (parts of speech), phonemes, and semantics. Finally, we show that high-level (word) predictions inform low-level (phoneme) predictions, supporting hierarchical predictive processing. Together, these results underscore the ubiquity of prediction in language processing, showing that the brain spontaneously predicts upcoming language at multiple levels of abstraction.
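The hierarchical step, in which word-level predictions inform phoneme-level predictions, can be illustrated with a toy calculation: marginalize a next-word distribution (in the study, taken from GPT-2) over a pronunciation lexicon to obtain the probability of the next phoneme. The word probabilities and lexicon entries below are made up purely for illustration; the published analysis used full model distributions and a complete phonemic dictionary.

```python
# Toy word-level predictive distribution and toy pronunciation lexicon.
word_probs = {"cat": 0.5, "cap": 0.2, "dog": 0.2, "door": 0.1}
lexicon = {
    "cat": ["K", "AE", "T"],
    "cap": ["K", "AE", "P"],
    "dog": ["D", "AO", "G"],
    "door": ["D", "AO", "R"],
}

def phoneme_prediction(word_probs, lexicon, phonemes_heard):
    """Probability of each candidate next phoneme, obtained by marginalizing
    word-level predictions over the words consistent with what was heard so far."""
    consistent = {
        w: p for w, p in word_probs.items()
        if lexicon[w][:len(phonemes_heard)] == phonemes_heard
    }
    total = sum(consistent.values())
    dist = {}
    for w, p in consistent.items():
        nxt = lexicon[w][len(phonemes_heard)]
        dist[nxt] = dist.get(nxt, 0.0) + p / total
    return dist

# After hearing /K AE/, the word-level prior makes /T/ (from "cat") the most
# expected next phoneme: {'T': ~0.71, 'P': ~0.29}.
print(phoneme_prediction(word_probs, lexicon, ["K", "AE"]))
```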
Communications Biology, Journal Year: 2022, Volume and Issue: 5(1), Published: Feb. 16, 2022
Deep learning algorithms trained to predict masked words from large amounts of text have recently been shown to generate activations similar to those of the human brain. However, what drives this similarity remains currently unknown. Here, we systematically compare a variety of deep language models to identify the computational principles that lead them to generate brain-like representations of sentences. Specifically, we analyze the brain responses to 400 isolated sentences in a cohort of 102 subjects, each recorded for two hours with functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG). We then test where and when each of these algorithms maps onto the brain responses. Finally, we estimate how the architecture, training, and performance of these models independently account for the generation of brain-like representations. Our analyses reveal two main findings. First, the similarity between the algorithms and the brain primarily depends on their ability to predict words from context. Second, this similarity reveals the rise and maintenance of perceptual, lexical, and compositional representations within each cortical region. Overall, this study shows that modern language algorithms partially converge towards brain-like solutions, and thus delineates a promising path to unravel the foundations of natural language processing.
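The model-to-brain comparison in these studies typically rests on a cross-validated linear encoding model. The sketch below shows that general recipe with placeholder random arrays standing in for the 400-sentence model activations and the fMRI/MEG responses; the array shapes, the layer choice implied by the comments, and the ridge penalties are assumptions, and the published analyses include additional steps (noise ceilings, subject-level modelling) not shown here.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Placeholder data: one row per sentence. X holds network activations
# (e.g., one GPT-2 layer), Y holds brain responses (voxels or sensors).
n_sentences, n_features, n_channels = 400, 768, 50
X = rng.standard_normal((n_sentences, n_features))
Y = rng.standard_normal((n_sentences, n_channels))

# "Brain score": cross-validated correlation between each brain channel and
# its prediction from a ridge regression fitted on the model activations.
scores = np.zeros(n_channels)
for train, test in KFold(n_splits=5).split(X):
    ridge = RidgeCV(alphas=np.logspace(-1, 4, 6)).fit(X[train], Y[train])
    pred = ridge.predict(X[test])
    for c in range(n_channels):
        scores[c] += pearsonr(pred[:, c], Y[test, c])[0] / 5

print("mean brain score across channels:", scores.mean())
```

With random placeholder arrays the score hovers around zero; with real activations and recordings the same procedure yields the layer-wise and region-wise scores reported in the studies above.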
Nature Human Behaviour, Journal Year: 2023, Volume and Issue: 7(3), P. 430 - 441, Published: March 2, 2023
Abstract
Considerable progress has recently been made in natural language processing: deep learning algorithms are increasingly able to generate, summarize, translate and classify texts. Yet, these language models still fail to match the language abilities of humans. Predictive coding theory offers a tentative explanation for this discrepancy: while language models are optimized to predict nearby words, the human brain would continuously predict a hierarchy of representations that spans multiple timescales. To test this hypothesis, we analysed the functional magnetic resonance imaging signals of 304 participants listening to short stories. First, we confirmed that the activations of modern language models linearly map onto the brain responses to speech. Second, we showed that enhancing these algorithms with predictions that span multiple timescales improves this brain mapping. Finally, we showed that these predictions are organized hierarchically: frontoparietal cortices predict higher-level, longer-range and more contextual representations than temporal cortices. Overall, these results strengthen the role of hierarchical predictive processing in language and illustrate how the synergy between neuroscience and artificial intelligence can unravel the computational bases of cognition.
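One simplified way to picture the enhancement with longer-range predictions is to append a representation of upcoming words to each word's activation and ask whether the same encoding model then fits the brain better. The function below is only a schematic, single-future-word version of that idea with placeholder arrays; the study built forecast representations from windows of future words and varied their distance and depth.

```python
import numpy as np

def add_forecast_window(activations, d):
    """Append the activation of the word d positions ahead to each word's
    activation (zero-padded at the end of the story). A deliberately
    simplified stand-in for a multi-word forecast window."""
    future = np.zeros_like(activations)
    if d > 0:
        future[:-d] = activations[d:]
    return np.concatenate([activations, future], axis=1)

# activations: (n_words, n_dims) language-model activations aligned to story words.
rng = np.random.default_rng(0)
activations = rng.standard_normal((1000, 768))     # placeholder values
enhanced = add_forecast_window(activations, d=8)
print(enhanced.shape)   # (1000, 1536)

# Fit the same ridge encoding model (see the earlier sketch) on `activations`
# and on `enhanced`; the gain in brain score is the "forecast" effect of interest.
```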
Cognitive Science, Journal Year: 2023, Volume and Issue: 47(7), Published: July 1, 2023
Humans can attribute beliefs to others. However, it is unknown to what extent this ability results from an innate biological endowment or from experience accrued through child development, particularly exposure to language describing others' mental states. We test the viability of this hypothesis by assessing whether models exposed to large quantities of human language display sensitivity to the implied knowledge states of characters in written passages. In pre-registered analyses, we present a linguistic version of the False Belief Task to both human participants and a large language model, GPT-3. Both are sensitive to others' beliefs, but while the language model significantly exceeds chance behavior, it does not perform as well as the humans, nor does it explain the full extent of their behavior, despite being exposed to more language than a human would in a lifetime. This suggests that while statistical learning may in part explain how humans develop the ability to reason about the minds of others, other mechanisms are also responsible.
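A minimal version of such a linguistic false-belief probe scores two candidate completions of a vignette under a causal language model and compares their log-probabilities. The sketch below uses GPT-2 as a freely available stand-in for GPT-3, and the vignette, the continuation_logprob helper and the two-completion scoring are illustrative assumptions rather than the study's pre-registered materials.

```python
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def continuation_logprob(prompt, continuation):
    """Sum of log-probabilities the model assigns to `continuation` given `prompt`."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    cont_ids = tok(continuation, return_tensors="pt").input_ids
    ids = torch.cat([prompt_ids, cont_ids], dim=1)
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = F.log_softmax(logits, dim=-1)
    start = prompt_ids.shape[1]
    return sum(
        log_probs[0, t - 1, ids[0, t]].item() for t in range(start, ids.shape[1])
    )

# Made-up false-belief vignette: the character did not see the object being moved,
# so a belief-sensitive reader should expect her to search the original location.
story = ("Anna put the chocolate in the drawer and left the room. "
         "While she was away, Tom moved it to the cupboard. "
         "Anna comes back and looks for the chocolate in the")
print("drawer  :", continuation_logprob(story, " drawer"))
print("cupboard:", continuation_logprob(story, " cupboard"))
```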