Cited by Minimal background noise enhances neural speech tracking: Evidence of stochastic resonance

SafeEar: Content Privacy-Preserving Audio Deepfake Detection

Xinfeng Li, Kai Li, Yifan Zheng

et al.

Published: Dec. 2, 2024

Text-to-Speech (TTS) and Voice Conversion (VC) models have exhibited remarkable performance in generating realistic and natural audio. However, their dark side, audio deepfake, poses a significant threat to both society and individuals. Existing countermeasures largely focus on determining the genuineness of speech based on complete original audio recordings, which however often contain private content. This oversight may refrain deepfake detection from many applications, particularly in scenarios involving sensitive information like business secrets. In this paper, we propose SafeEar, a novel framework that aims to detect deepfake audios without relying on accessing the speech content within. Our key idea is to devise a neural audio codec into a novel decoupling model that well separates the semantic and acoustic information from audio samples, and only use the acoustic information (e.g., prosody and timbre) for deepfake detection. In this way, no semantic content will be exposed to the detector. To overcome the challenge of identifying diverse deepfake audio without semantic clues, we enhance our deepfake detector with real-world codec augmentation. Extensive experiments conducted on four benchmark datasets demonstrate SafeEar's effectiveness in detecting various deepfake techniques with an equal error rate (EER) down to 2.02%. Simultaneously, it shields five-language speech content from being deciphered by both machine and human auditory analysis, demonstrated by word error rates (WERs) all above 93.93% and our user study. Furthermore, our benchmark constructed for anti-deepfake and anti-content recovery evaluation helps provide a basis for future research in the realms of audio privacy preservation and deepfake detection.

Language: English

Reliability and generalizability of neural speech tracking in younger and older adults

Ryan A. Panela, Francesca Copelli, Björn Herrmann

et al.

Neurobiology of Aging, Journal Year: 2023, Volume and Issue: 134, P. 165 - 180

Published: Nov. 21, 2023

Language: English

Resilience and vulnerability of neural speech tracking after hearing restoration

Alessandra Federici, Marta Fantoni, Francesco Pavani

et al.

Communications Biology, Journal Year: 2025, Volume and Issue: 8(1)

Published: March 1, 2025

The role of early auditory experience in the development of neural speech tracking remains an open question. To address this issue, we measured neural speech tracking in children with or without functional hearing during their first year of life after their hearing was restored with cochlear implants (CIs), as well as in hearing controls (HC). Neural tracking in children with CIs is unaffected by the absence of perinatal auditory experience. CI users and HC exhibit a similar neural tracking magnitude at short timescales of brain activity. However, neural tracking is delayed in CI users, and its timing depends on the age of hearing restoration. Conversely, at longer timescales, speech tracking is dampened in participants using CIs, thereby accounting for their comprehension deficits. These findings highlight the resilience of sensory processing in speech tracking while also demonstrating the vulnerability of higher-level processing to the lack of early auditory experience. Neural speech tracking in children with cochlear implants shows that the phase of hearing loss affects tracking differently. Tracking is present but weaker than in hearing ones, impacting comprehension.

Language: English

Minimal background noise enhances neural speech tracking: Evidence of stochastic resonance

Björn Herrmann

Published: March 10, 2025

Neural activity in auditory cortex tracks the amplitude-onset envelope of continuous speech, but recent work counter-intuitively suggests that neural tracking increases when speech is masked by background noise, despite reduced speech intelligibility. Noise-related amplification could indicate that stochastic resonance – the response facilitation through noise – supports neural speech tracking, but a comprehensive account is lacking. In five human electroencephalography (EEG) experiments, the current study demonstrates a generalized enhancement of neural speech tracking due to minimal background noise. Results show that a) neural speech tracking is enhanced for speech masked by background noise at very high SNRs (∼30 dB SNR) where speech is highly intelligible; b) this enhancement is independent of attention; c) it generalizes across different stationary background maskers, but is strongest for 12-talker babble; and d) it is present for headphone and free-field listening, suggesting that the neural-tracking enhancement generalizes to real-life listening. The work paints a clear picture that minimal background noise enhances the neural representation of the speech onset-envelope, suggesting that stochastic resonance contributes to neural speech tracking. The work further highlights non-linearities of neural tracking induced by background noise that make its use as a biological marker for speech processing challenging.

Language: English

Enhanced neural speech tracking through noise indicates stochastic resonance in humans

Björn Herrmann

eLife, Journal Year: 2025, Volume and Issue: 13

Published: March 18, 2025

Neural activity in auditory cortex tracks the amplitude-onset envelope of continuous speech, but recent work counterintuitively suggests that neural tracking increases when speech is masked by background noise, despite reduced speech intelligibility. Noise-related amplification could indicate that stochastic resonance – the response facilitation through noise – supports neural speech tracking, but a comprehensive account is lacking. In five human electroencephalography experiments, the current study demonstrates a generalized enhancement of neural speech tracking due to minimal background noise. Results show that (1) neural speech tracking is enhanced for speech masked by background noise at very high signal-to-noise ratios (~30 dB SNR) where speech is highly intelligible; (2) this enhancement is independent of attention; (3) it generalizes across different stationary background maskers, but is strongest for 12-talker babble; and (4) it is present for headphone and free-field listening, suggesting that the neural-tracking enhancement generalizes to real-life listening. The work paints a clear picture that minimal background noise enhances the neural representation of the speech onset-envelope, suggesting that stochastic resonance contributes to neural speech tracking. The work further highlights non-linearities of neural tracking induced by background noise that make its use as a biological marker for speech processing challenging.

Language: English

Dynamic modeling of EEG responses to natural speech reveals earlier processing of predictable words

Jin Dou, Andrew J. Anderson, Aaron Steven White

et al.

PLoS Computational Biology, Journal Year: 2025, Volume and Issue: 21(4), P. e1013006 - e1013006

Published: April 28, 2025

In recent years, it has become clear that EEG indexes the comprehension of natural, narrative speech. One particularly compelling demonstration of this fact can be seen by regressing EEG responses to speech against measures of how individual words in that speech linguistically relate to their preceding context. This approach produces a so-called temporal response function that displays a centro-parietal negativity reminiscent of the classic N400 component of the event-related potential. One shortcoming of previous implementations of this approach is that they have typically assumed a linear, time-invariant relationship between the linguistic speech features and the EEG responses. In other words, the analysis typically assumes that the response has the same shape and timing for every word – and only varies (linearly) in terms of its amplitude. In the present work, we relax this assumption under the hypothesis that individual words may be processed more rapidly when they are predictable. Specifically, we introduce a framework wherein the standard linear temporal response function can be modulated in terms of its amplitude, latency, and temporal scale based on the predictability of the current and prior words. We use the proposed approach to model EEG recorded from a set of participants who listened to an audiobook narrated by a single talker, and a separate set of participants who attended to one of two concurrently presented audiobooks. We show that expected words are processed faster, evoking lower-amplitude N400-like responses with earlier peaks, and that this effect is driven both by the word’s own predictability and by the predictability of the immediately preceding word. Additional analysis suggests that this finding is not simply explained in terms of how quickly words can be disambiguated from their phonetic neighbors. As such, our study demonstrates that the timing and amplitude of brain responses to words in natural speech depend on their predictability. By accounting for these effects, our framework also improves the accuracy with which neural responses to natural speech can be modeled.

Language: English

Sound degradation type differentially affects neural indicators of cognitive workload and speech tracking

Nathan Gagné, Keelin M. Greenlaw, Emily B. J. Coffey

et al.

Hearing Research, Journal Year: 2025, Volume and Issue: unknown, P. 109303 - 109303

Published: May 1, 2025

Language: English

Neural processing of speech comprehension in noise predicts individual age using fNIRS-based brain-behavior models

Yi Liu, Songjian Wang, Jing Lu

et al.

Cerebral Cortex, Journal Year: 2024, Volume and Issue: 34(5)

Published: May 1, 2024

Speech comprehension in noise depends on complex interactions between peripheral sensory and central cognitive systems. Despite having normal peripheral hearing, older adults show difficulties in speech comprehension. It remains unclear whether the brain’s neural responses could indicate aging. The current study examined whether individual brain activation during speech perception in different listening environments could predict age. We applied functional near-infrared spectroscopy to 93 normal-hearing human adults (20 to 70 years old) during a sentence listening task, which contained a quiet condition and 4 noisy conditions with different signal-to-noise ratios (SNR = 10, 5, 0, −5 dB). A data-driven approach, region-based brain-age predictive modeling, was adopted. We observed a significant behavioral decrease with age under the 4 noisy conditions, but not under the quiet condition. Brain activations in the SNR = 10 dB listening condition could successfully predict an individual’s age. Moreover, we found that the bilateral visual sensory cortex, left dorsal speech pathway, left cerebellum, right temporal–parietal junction area, right homolog of Wernicke’s area, and right middle temporal gyrus contributed most to prediction performance. These results demonstrate that the activations of regions about sensory-motor mapping of sound, especially in noisy conditions, could be more sensitive measures for age prediction than external behavior measures.

Language: English

Distinct roles of SNR, speech Intelligibility, and attentional effort on neural speech tracking in noise

Xiaomin He, Vinay Raghavan, Nima Mesgarani

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Oct. 12, 2024

Robust neural encoding of speech in noise is influenced by several factors, including signal-to-noise ratio (SNR), speech intelligibility (SI), and attentional effort (AE). Yet, the interaction and distinct role of these factors remain unclear. In this study, fourteen native English speakers performed selective speech listening tasks at various SNR levels while EEG responses were recorded. Attentional performance was assessed using a repeated word detection task, and attentional effort was inferred from subjects' gaze velocity. Results indicate that both SNR and SI enhance neural tracking of target speech, with distinct effects influenced by the previously overlooked role of attentional effort. Specifically, at high levels of SI, increasing SNR leads to reduced attentional effort, which in turn decreases neural speech tracking. Our findings highlight the importance of differentiating the roles of SNR, SI, and AE in neural speech processing and advance our understanding of how noisy speech is processed in the auditory pathway.

Language: English

Neural Decoding of the Speech Envelope: Effects of Intelligibility and Spectral Degradation

Alexis Deighton MacIntyre, Robert P. Carlyon, Tobias Goehring

et al.

Trends in Hearing, Journal Year: 2024, Volume and Issue: 28

Published: Jan. 1, 2024

During continuous speech perception, endogenous neural activity becomes time-locked to acoustic stimulus features, such as the speech amplitude envelope. This speech–brain coupling can be decoded using non-invasive brain imaging techniques, including electroencephalography (EEG). Neural decoding may provide clinical use as an objective measure of stimulus encoding by the brain, for example during cochlear implant listening, wherein the speech signal is severely spectrally degraded. Yet, interplay between acoustic and linguistic factors may lead to top-down modulation of perception, thereby complicating audiological applications. To address this ambiguity, we assess neural decoding of the speech envelope under spectral degradation with EEG in acoustically hearing listeners (n = 38; 18–35 years old) using vocoded speech. We dissociate sensory encoding from higher-order processing by employing intelligible (English) and non-intelligible (Dutch) stimuli, with auditory attention sustained using a repeated-phrase detection task. Subject-specific and group decoders were trained to reconstruct the speech envelope from held-out EEG data, with decoder significance determined via random permutation testing. Whereas speech envelope reconstruction did not vary by spectral resolution, intelligible speech was associated with better decoding accuracy in general. Results were similar across subject-specific and group analyses, with less consistent effects of spectral degradation in group decoding. Permutation tests revealed possible differences in decoder statistical significance by experimental condition. In general, while robust neural decoding was observed at the individual and group level, variability within participants would most likely prevent the clinical use of such a measure to differentiate levels of spectral degradation and intelligibility on an individual basis.

Language: English
