Spatio-temporal Weber Gradient Directional feature for visual and audio-visual phrase recognition systems
Salam Nandakishor, Debadatta Pati

International Journal of Information Technology, Journal Year: 2024, Volume and Issue: unknown

Published: Aug. 12, 2024

Language: English

Reverb and Noise as Real-World Effects in Speech Recognition Models: A Study and a Proposal of a Feature Set
Valerio Cesarini, Giovanni Costantini

Applied Sciences, Journal Year: 2024, Volume and Issue: 14(23), P. 11446 - 11446

Published: Dec. 9, 2024

Reverberation and background noise are common, unavoidable real-world phenomena that hinder automatic speaker recognition systems, particularly because these systems are typically trained on noise-free data. Most models rely on fixed audio feature sets. To evaluate the dependency of features on reverberation and noise, this study proposes augmenting the commonly used mel-frequency cepstral coefficients (MFCCs) with relative spectral (RASTA) features. The performance of these features was assessed using noisy data generated by applying reverberation and pink noise to the DEMoS dataset, which includes 56 speakers. Verification models were trained on clean data using MFCCs, RASTA features, or their combination as inputs. They were validated on augmented data with progressively increasing noise and reverberation levels. The results indicate that MFCCs struggle to identify the main speaker, while the RASTA method has difficulty with the opposite class. The hybrid feature set, derived from their combination, demonstrates the best overall performance as a compromise between the two. Although the MFCC method is the standard and performs well on clean training data, it shows a significant tendency to misclassify the main speaker in real-world scenarios, a critical limitation for modern user-centric verification applications. The hybrid feature set therefore proves effective as a balanced solution, optimizing both sensitivity and specificity.

Language: English

Citations

1
