A Psycholinguistics-inspired Method to Counter IP Theft Using Fake Documents
Natalia Denisenko, Youzhi Zhang, Chiara Pulice

et al.

ACM Transactions on Management Information Systems, Journal Year: 2024, Volume and Issue: 15(2), P. 1 - 25

Published: March 6, 2024

Intellectual property (IP) theft is a growing problem. We build on prior work to deter IP theft by generating n fake versions of a technical document, so that a thief has to expend time and effort in identifying the correct document. Our new SbFAKE framework proposes, for the first time, a novel combination of language processing, optimization, and the psycholinguistic concept of surprisal to generate a set of such fakes. We start by combining psycholinguistic-based surprisal scores with optimization into two bilevel problems (an Explicit one and a simpler Implicit one) whose solutions correspond directly to the desired set of fakes. As bilevel problems are usually hard to solve, we then show that these can each be reduced to an equivalent surprisal-based linear program. We performed detailed parameter tuning experiments and identified the best parameters for our algorithms. We then tested the two variants (with their best settings) against the best performing prior work in the field. Our variants are able to generate convincing fakes more effectively than past work. In addition, replacing words in an original document with words having similar surprisal scores generates greater levels of deception.
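The approach described above hinges on swapping words for alternatives with similar surprisal under a language model. As a rough, hedged sketch only (not the authors' SbFAKE pipeline; the context sentence, candidate list, and choice of GPT-2 are illustrative assumptions), such a replacement step might look like this:

```python
# Hedged sketch, not the SbFAKE implementation: pick a replacement word whose
# in-context surprisal is closest to the original word's surprisal under GPT-2.
# The context sentence and candidate list below are made-up placeholders.
import math

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def word_surprisal(prefix: str, word: str) -> float:
    """Surprisal (bits) of the first sub-token of `word` following `prefix`."""
    ids = tok(prefix, return_tensors="pt").input_ids
    with torch.no_grad():
        next_logits = model(ids).logits[0, -1]          # logits for the next token
    logprobs = torch.log_softmax(next_logits, dim=-1)
    word_id = tok(" " + word).input_ids[0]              # leading space marks a word boundary
    return -logprobs[word_id].item() / math.log(2)

prefix = "The coolant flows through a closed"
original = "loop"
candidates = ["circuit", "chamber", "channel", "valve"]  # hypothetical replacements

target = word_surprisal(prefix, original)
best = min(candidates, key=lambda w: abs(word_surprisal(prefix, w) - target))
print(f"original surprisal: {target:.2f} bits; closest-surprisal replacement: {best}")
```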

Language: English

Expectation violations signal goals in novel human communication

Tatia Buidze, Tobias Sommer, Ke Zhao

et al.

Nature Communications, Journal Year: 2025, Volume and Issue: 16(1)

Published: Feb. 26, 2025

Abstract Communication, often grounded in shared expectations, faces challenges when a Sender and Receiver lack a common linguistic background. Our study explores how people instinctively turn to the fundamental principles of the physical world to overcome such barriers. Specifically, through an experimental game in which Senders convey messages via trajectories, we investigate how they develop novel communicative strategies without relying on linguistic cues. We build a computational model based on the principle of expectancy violations and a set of universal priors derived from movement kinetics. The model replicates participant-designed messages with high accuracy and shows that its core variable, surprise, predicts the Receiver's physiological and neuronal responses in brain areas processing expectation violations. This work highlights the adaptability of human communication, showing that surprise can be a powerful tool for forming a new communicative language.
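The model's core variable is surprise; one standard formalization, assumed here for illustration rather than taken from the paper, treats it as the negative log probability of an observed movement segment under the Receiver's prior expectations:

```latex
% Generic expectation-violation (surprise) score for an observed segment x_t,
% given prior expectations; the paper derives its priors from movement kinetics.
S(x_t) = -\log P\left(x_t \mid x_{<t}, \text{priors}\right)
```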

Language: English

Citations: 2

Prediction during language comprehension: what is next?
Rachel Ryskin, Mante S. Nieuwland

Trends in Cognitive Sciences, Journal Year: 2023, Volume and Issue: 27(11), P. 1032 - 1052

Published: Sept. 11, 2023

Prediction is often regarded as an integral aspect of incremental language comprehension, but little is known about the cognitive architectures and mechanisms that support it. We review studies showing that listeners and readers use all manner of contextual information to generate multifaceted predictions about upcoming input. The nature of these predictions may vary between individuals owing to differences in experience, among other factors. We then turn to unresolved questions which may guide the search for the underlying mechanisms. (i) Is prediction essential to language processing or an optional strategy? (ii) Are predictions generated from within the language system or by domain-general processes? (iii) What is the relationship between prediction and memory? (iv) Does prediction for comprehension require simulation via the production system? We discuss promising directions for making progress in answering these questions and for developing a mechanistic understanding of prediction in language.

Language: English

Citations: 41

Application of machine learning strategies in screening transition metal oxide based ozonation catalysts for BAA degradation

Zhao-Gang Ding, Sheng Liu, Xinxin Lv

et al.

Journal of Water Process Engineering, Journal Year: 2025, Volume and Issue: 71, P. 107411 - 107411

Published: March 1, 2025

Language: English

Citations: 1

The Plausibility of Sampling as an Algorithmic Theory of Sentence Processing
Jacob Louis Hoover, Morgan Sonderegger, Steven T. Piantadosi

et al.

Open Mind, Journal Year: 2023, Volume and Issue: unknown, P. 1 - 42

Published: June 1, 2023

Words that are more surprising given their context take longer to process. However, no incremental parsing algorithm has been shown to directly predict this phenomenon. In this work, we focus on a class of algorithms whose runtime does naturally scale in surprisal: those that involve repeatedly sampling from the prior. Our first contribution is to show in simple examples that the runtime of such algorithms increases superlinearly with surprisal, and that the variance of runtimes also increases. These two predictions stand in contrast to the literature on surprisal theory (Hale, 2001; Levy, 2008a), which assumes that expected processing cost increases linearly with surprisal and makes no prediction about variance. In the second part of the paper, we conduct an empirical study of the relationship between surprisal and reading time, using a collection of modern language models to estimate surprisal. We find that with better language models, reading time increases superlinearly with surprisal. These results are consistent with sampling-based algorithms.
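The runtime claim can be illustrated with a toy simulation, assuming the simplest sampling scheme (draw candidates from the prior until the observed one is hit, i.e., a geometric distribution); this is a hedged illustration of the class of algorithms discussed, not the paper's own models:

```python
# Hedged toy illustration only: if the processor repeatedly samples candidate
# interpretations from the prior until it draws the observed one (probability p),
# the number of draws is geometric, so expected runtime is 1/p = e^{surprisal},
# and runtime variance also grows with surprisal.
import math
import random

random.seed(0)

def draws_until_hit(p: float) -> int:
    """Number of independent prior samples needed to hit an outcome of probability p."""
    n = 1
    while random.random() >= p:
        n += 1
    return n

for surprisal_nats in (1.0, 2.0, 3.0, 4.0):
    p = math.exp(-surprisal_nats)
    runs = [draws_until_hit(p) for _ in range(5000)]
    mean = sum(runs) / len(runs)
    var = sum((r - mean) ** 2 for r in runs) / len(runs)
    print(f"surprisal={surprisal_nats:.1f} nats  mean draws={mean:8.1f}  variance={var:12.1f}")
```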

Language: English

Citations: 16

Driving and suppressing the human language network using large language models
Greta Tuckute, Aalok Sathe, Shashank Srikant

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown

Published: April 16, 2023

Transformer models such as GPT generate human-like language and are highly predictive of human brain responses to language. Here, using fMRI-measured responses to 1,000 diverse sentences, we first show that a GPT-based encoding model can predict the magnitude of the brain response associated with each sentence. Then, we use the model to identify new sentences that are predicted to drive or suppress responses in the human language network. We show that these model-selected novel sentences indeed strongly drive and suppress the activity of human language areas in new individuals. A systematic analysis reveals that surprisal and well-formedness of linguistic input are key determinants of response strength in the language network. These results establish the ability of neural network models not only to mimic human language but also to noninvasively control neural activity in higher-level cortical areas, like the language network.
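As a hedged sketch of what a sentence-level encoding model of this kind might look like (not the authors' code; the features, targets, and ridge-regression choice are placeholder assumptions), one could regress per-sentence fMRI response magnitudes on language-model features:

```python
# Hedged sketch of a sentence-level encoding model (not the authors' code):
# ridge regression from per-sentence language-model features to a scalar fMRI
# response. Features and responses below are random placeholders.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_sentences, n_features = 1000, 768                      # e.g., one LM hidden state per sentence
X = rng.standard_normal((n_sentences, n_features))       # placeholder sentence features
w = rng.standard_normal(n_features) / np.sqrt(n_features)
y = X @ w + 0.5 * rng.standard_normal(n_sentences)       # placeholder response magnitudes

encoder = RidgeCV(alphas=np.logspace(-2, 4, 13))
scores = cross_val_score(encoder, X, y, cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.3f}")

# New candidate sentences could then be ranked by encoder.fit(X, y).predict(...)
# to select "drive" (high predicted response) or "suppress" (low) stimuli.
```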

Language: English

Citations: 15

Word Frequency and Predictability Dissociate in Naturalistic Reading
Cory Shain

Open Mind, Journal Year: 2024, Volume and Issue: 8, P. 177 - 201

Published: Jan. 1, 2024

Abstract Many studies of human language processing have shown that readers slow down at less frequent or less predictable words, but there is debate about whether frequency and predictability effects reflect separable cognitive phenomena: are the operations that retrieve words from the mental lexicon based on sensory cues distinct from those that predict upcoming words from context? Previous evidence for a frequency-predictability dissociation mostly comes from small samples (both for estimating predictability and for testing behavior), artificial materials (e.g., isolated constructed sentences), and implausible modeling assumptions (discrete-time dynamics, linearity, additivity, constant variance, and invariance over time), which raises the question: do frequency and predictability dissociate in ordinary language comprehension, such as story reading? This study leverages recent progress in open data and computational modeling to address this question at scale. A large collection of naturalistic reading data (six datasets, >2.2 M datapoints) is analyzed using nonlinear continuous-time regression, with predictability estimated from statistical language models trained on more data than is currently typical in psycholinguistics. Despite the use of naturalistic data, strong predictability estimates, and flexible regression models, results converge with earlier experimental studies in supporting dissociable and additive frequency and predictability effects.
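A drastically simplified illustration of the dissociation question, assuming synthetic data and ordinary least squares rather than the paper's nonlinear continuous-time regression, is to enter log frequency and surprisal jointly as reading-time predictors and check whether each retains a reliable coefficient:

```python
# Hedged, much-simplified illustration (the paper uses nonlinear continuous-time
# regression, not OLS): enter log frequency and surprisal jointly as predictors
# of reading time and inspect both coefficients. All data below are synthetic
# placeholders.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5000
log_freq = rng.normal(-4.0, 1.0, n)                        # placeholder log unigram frequency
surprisal = 0.6 * (-log_freq) + rng.normal(0.0, 1.0, n)    # correlated with frequency, not identical
rt = 250.0 - 8.0 * log_freq + 12.0 * surprisal + rng.normal(0.0, 30.0, n)  # reading time (ms)

X = sm.add_constant(np.column_stack([log_freq, surprisal]))
fit = sm.OLS(rt, X).fit()
print(fit.summary(xname=["const", "log_freq", "surprisal"]))
# Dissociable, additive effects would show up as both slopes being reliably nonzero.
```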

Language: English

Citations: 6

Linguistic inputs must be syntactically parsable to fully engage the language network
Carina Kauf, Hee So Kim, Elizabeth J. Lee

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: June 22, 2024

Abstract Human language comprehension is remarkably robust to ill-formed inputs (e.g., word transpositions). This robustness has led some to argue that syntactic parsing is largely an illusion, and that incremental comprehension is more heuristic, shallow, and semantics-based than often assumed. However, the available data are also consistent with the possibility that humans always perform rule-like symbolic parsing and simply deploy error correction mechanisms to reconstruct ill-formed inputs when needed. We put these hypotheses to a new stringent test by examining brain responses to a) stimuli that should pose a challenge for syntactic reconstruction but allow complex meanings to be built within local contexts through associative/shallow processing (sentences presented in backward word order), and b) grammatically well-formed but semantically implausible sentences that should impede heuristic processing. Using a novel behavioral paradigm, we demonstrate that backward-presented sentences indeed impede the recovery of grammatical structure during comprehension. Critically, backward-presented sentences elicit a relatively low response in the language areas, as measured with fMRI. In contrast, semantically implausible sentences elicit a response in the language areas similar in magnitude to that of naturalistic (plausible) sentences. In other words, the ability to build syntactic structures is both necessary and sufficient to fully engage the language network. Taken together, these results provide the strongest evidence to date in support of a generalized reliance of human language comprehension on syntactic parsing.

Significance statement: Whether language comprehension relies predominantly on structural (syntactic) cues or meaning-related (semantic) cues remains debated. We shed light on this question by examining the language areas' responses to stimuli where syntactic and semantic cues are pitted against each other, using fMRI. We find that the language areas respond weakly to stimuli that allow for semantic composition but cannot be parsed syntactically (as confirmed in a novel behavioral paradigm), and that they respond strongly to grammatical but semantically implausible sentences, like the famous 'Colorless green ideas sleep furiously' sentence. These findings challenge accounts which suggest that syntactic parsing can be foregone in favor of shallow semantic processing.

Language: English

Citations: 4

Demystifying large language models in second language development research
Yan Cong

Computer Speech & Language, Journal Year: 2024, Volume and Issue: 89, P. 101700 - 101700

Published: July 26, 2024

Evaluating students' textual responses is a common and critical task in language research and education practice. However, manual assessment can be tedious and may lack consistency, posing challenges for both scientific discovery and frontline teaching. Leveraging state-of-the-art large language models (LLMs), we aim to define and operationalize LLM-Surprisal, a numeric representation of the interplay between lexical diversity and syntactic complexity, and to demonstrate empirically and theoretically its relevance to automatic writing assessment in Chinese L2 (second language) learners' English development. We developed an LLM-based natural language processing pipeline that can automatically compute text Surprisal scores. By comparing Surprisal metrics with widely used classic indices in L2 studies, we extended the usage of computational metrics in L2 writing research. Our analyses suggested that LLM-Surprisals can distinguish L2 from L1 (first language) writing, index L2 development stages, and predict scores provided by human professionals. This indicated that the Surprisal dimension may manifest itself in key aspects of writing development. The relative advantages and disadvantages of these approaches were discussed in depth. We concluded that LLMs are promising tools that can enhance L2 research. This showcase paves the way for more nuanced, computationally informed approaches to assessing and understanding L2 writing. Our pipelines and findings will inspire teachers, learners, and researchers in an innovative and accessible manner.
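A minimal sketch of a text-level surprisal score, assuming GPT-2 and mean per-token surprisal in bits (the paper's own LLM-Surprisal pipeline may differ in model choice, tokenization, and aggregation), could be:

```python
# Hedged sketch of a text-level surprisal score: mean per-token surprisal (bits)
# under GPT-2. This is an illustrative stand-in, not the paper's pipeline.
import math

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def mean_surprisal(text: str) -> float:
    """Average per-token surprisal of `text`, in bits."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)    # predictions for tokens 2..T
    targets = ids[0, 1:]                                    # the tokens actually observed
    token_logprobs = logprobs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return (-token_logprobs.mean() / math.log(2)).item()

print(mean_surprisal("The committee decided to postpone the vote until next week."))
```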

Language: English

Citations: 4

Lexical Processing Strongly Affects Reading Times But Not Skipping During Natural Reading
Micha Heilbron, Jorie van Haren, Peter Hagoort

et al.

Open Mind, Journal Year: 2023, Volume and Issue: 7, P. 757 - 783

Published: Jan. 1, 2023

Abstract In a typical text, readers look much longer at some words than at others, even skipping many altogether. Historically, researchers explained this variation via low-level visual or oculomotor factors, but today it is primarily explained via factors determining a word's lexical processing ease, such as how well word identity can be predicted from context or discerned from parafoveal preview. While the existence of these effects has been established in controlled experiments, the relative importance of prediction, preview and lexical factors in natural reading remains unclear. Here, we address this question in three large naturalistic reading corpora (n = 104, 1.5 million words), using deep neural networks and Bayesian ideal observers to model linguistic prediction and parafoveal preview from moment to moment in natural reading. Strikingly, neither prediction nor preview was important for explaining word skipping: the vast majority of explainable variance was captured by a simple oculomotor model, using just fixation position and word length. For reading times, by contrast, we found strong and independent contributions of prediction and preview, with effect sizes matching those of controlled experiments. Together, these results challenge dominant models of eye movements in reading, and instead support alternative models that describe skipping (but not reading times) as largely autonomous from word identification, and mostly determined by low-level oculomotor information.
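As a hedged illustration of the kind of "simple oculomotor model" contrasted with lexical accounts above, assuming synthetic data and a logistic-regression stand-in (not the authors' Bayesian ideal-observer setup), skipping can be modeled from word length and launch-site distance alone:

```python
# Hedged illustration of a "simple oculomotor model" of skipping: logistic
# regression from word length and launch-site distance alone, on synthetic
# placeholder data (not the authors' corpora or models).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 20000
word_len = rng.integers(1, 13, n).astype(float)        # word length in characters
launch_dist = rng.normal(7.0, 3.0, n)                  # letters between prior fixation and word
logit = 2.5 - 0.55 * word_len - 0.12 * launch_dist     # placeholder "true" skipping model
skipped = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X = np.column_stack([word_len, launch_dist])
clf = LogisticRegression()
auc = cross_val_score(clf, X, skipped, cv=5, scoring="roc_auc")
print(f"skipping AUC from length and launch distance alone: {auc.mean():.3f}")
```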

Language: English

Citations: 10

Informativity enhances memory robustness against interference in sentence comprehension
Weijie Xu, Richard Futrell

Journal of Memory and Language, Journal Year: 2025, Volume and Issue: 142, P. 104603 - 104603

Published: Jan. 18, 2025

Language: English

Citations: 0