DCEPNet: Dual-Channel Emotional Perception Network for Speech Emotion Recognition DOI
Fei Xiang, Hongbo Liu, Ruili Wang

et al.

Published: Dec. 3, 2024

Language: English

Speech Emotion Recognition Using Generative Adversarial Network and Deep Convolutional Neural Network DOI
Kishor Bhangale, Mohanaprasad Kothandaraman

Circuits Systems and Signal Processing, Journal Year: 2023, Volume and Issue: 43(4), P. 2341 - 2384

Published: Dec. 16, 2023

Language: English

Citations

10

A Comparative Analysis of Constant-Q Transform, Gammatonegram, and Mel-Spectrogram Techniques for AI-aided Cardiac Diagnostics DOI
Mohammed Saddek Mekahlia, Mohamed Fezari, Ahcen Aliouat

et al.

Medical Engineering & Physics, Journal Year: 2025, Volume and Issue: 137, P. 104302 - 104302

Published: Feb. 6, 2025

Language: English

Citations

0

Multimodal Emotion Recognition based on Face and Speech using Deep Convolution Neural Network and Long Short Term Memory DOI
Shwetkranti Taware, Anuradha Thakare

Circuits Systems and Signal Processing, Journal Year: 2025, Volume and Issue: unknown

Published: April 25, 2025

Language: English

Citations

0

Speech Emotion Recognition using Mel Spectrogram and Convolutional Neural Networks (CNN) DOI Open Access

Vidhi Sareen, K. R. Seeja

Procedia Computer Science, Journal Year: 2025, Volume and Issue: 258, P. 3693 - 3702

Published: Jan. 1, 2025

Language: English

Citations

0

A novel two-way feature extraction technique using multiple acoustic and wavelets packets for deep learning based speech emotion recognition DOI
Kishor Bhangale, Mohanaprasad Kothandaraman

Multimedia Tools and Applications, Journal Year: 2024, Volume and Issue: unknown

Published: June 17, 2024

Language: English

Citations

2

BSER: A Learning Framework for Bangla Speech Emotion Recognition DOI
Md. Mahadi Hassan, M. Raihan, Md. Mehedi Hassan

et al.

Published: May 2, 2024

Human-Computer Interaction (HCI) relies on accurate speech emotion identification. Speech Emotion Recognition (SER) analyzes voice signals to classify emotions. English-based SER has been studied extensively, while Bangla SER has not. This study integrates a one-dimensional convolutional neural network with a long short-term memory (LSTM) architecture and fully connected layers for SER. Emotion categorization requires informative features, which this method provides. Additive White Gaussian Noise (AWGN), signal elongation, and pitch alteration are applied to augment the data and improve dataset dependability. Mel-frequency cepstral coefficients (MFCC), the Mel-spectrogram, Zero Crossing Rate (ZCR), the chromagram, and Root Mean Square (RMS) energy are analyzed in the study. In the model, one-dimensional convolutional blocks extract local information while LSTM layers capture global trends. Training and testing loss curves, the confusion matrix, recall, precision, F1-score, and accuracy are used for evaluation on two benchmark datasets: the SUST Bangla Emotional Speech Corpus (SUBESCO) and the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Experimental results show that the proposed BSER model is more resilient than baseline models on both datasets, advancing the research area and showing that the hybrid architecture can detect emotions from speech inputs.
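The augmentation and frame-level feature steps named in the abstract (AWGN plus ZCR/RMS analysis) can be sketched in plain NumPy. This is an illustrative reconstruction, not the authors' code: `add_awgn`, `frame_features`, the 20 dB SNR, and the frame sizes are all assumptions for the sketch.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Add white Gaussian noise at a target signal-to-noise ratio (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def frame_features(signal, frame_len=512, hop=256):
    """Per-frame Zero Crossing Rate and Root Mean Square energy."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    zcr, rms = [], []
    for i in range(n_frames):
        frame = signal[i * hop: i * hop + frame_len]
        # Fraction of sample pairs whose sign flips within the frame.
        zcr.append(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
        rms.append(np.sqrt(np.mean(frame ** 2)))
    return np.array(zcr), np.array(rms)

# Toy usage: a 440 Hz tone augmented with AWGN at 20 dB SNR.
sr = 16000
t = np.linspace(0, 1.0, sr, endpoint=False)
clean = np.sin(2 * np.pi * 440 * t)
noisy = add_awgn(clean, snr_db=20, rng=np.random.default_rng(0))
zcr, rms = frame_features(noisy)
```

In practice, MFCC, Mel-spectrogram, and chromagram features would come from an audio library rather than hand-rolled code; the sketch only shows the two features that are simple enough to compute directly.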

Language: English

Citations

1

A Feature-Reduction Scheme Based on a Two-Sample t-Test to Eliminate Useless Spectrogram Frequency Bands in Acoustic Event Detection Systems DOI Open Access
Vahid Hajihashemi, Abdorreza Alavi Gharahbagh, Narges Hajaboutalebi

et al.

Electronics, Journal Year: 2024, Volume and Issue: 13(11), P. 2064 - 2064

Published: May 25, 2024

Acoustic event detection (AED) systems, combined with video surveillance, can enhance urban security and safety by automatically detecting incidents, supporting the smart city concept. AED systems mostly use Mel spectrograms, a well-known and effective acoustic feature. A spectrogram is a combination of frequency bands, and a major challenge is that some bands may be similar across different events and therefore useless for AED. Removing such bands reduces the input feature dimension, which is highly desirable. This article proposes a mathematical analysis method to identify and eliminate ineffective bands and improve AED systems' efficiency. The proposed approach uses Student's t-test to compare frequency bands from different events: the similarity between each pair of bands is calculated using a two-sample t-test, allowing the distinct bands to be identified. Eliminating similar bands accelerates the training speed of the classifier by reducing the number of features and also enhances the system's accuracy. Based on the obtained results, the feature set is reduced by 26.3%. The results showed an average difference of 7.77% in Jaccard, 4.07% in Dice, and 5.7% in Hamming distance between the bands selected from the train and test datasets. These small values underscore the validity of the method for the dataset.
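The band-elimination idea can be illustrated with a small NumPy sketch: a Welch two-sample t statistic is computed per frequency band for two event classes, and only bands whose statistic exceeds a threshold (bands with genuinely different distributions across the two events) are kept. This is a simplified reading of the paper's method; the function names, the threshold of 4.0, and the toy data are assumptions, and the paper's actual pipeline (Mel-spectrogram extraction, comparisons across many event pairs) is not reproduced.

```python
import numpy as np

def band_t_statistics(spec_a, spec_b):
    """Welch two-sample t statistic per frequency band.

    spec_a, spec_b: (n_frames, n_bands) spectrogram magnitudes for two
    acoustic event classes; each band's frame values form one sample."""
    m1, m2 = spec_a.mean(axis=0), spec_b.mean(axis=0)
    v1, v2 = spec_a.var(axis=0, ddof=1), spec_b.var(axis=0, ddof=1)
    n1, n2 = spec_a.shape[0], spec_b.shape[0]
    return (m1 - m2) / np.sqrt(v1 / n1 + v2 / n2)

def select_distinct_bands(spec_a, spec_b, t_threshold=2.0):
    """Indices of bands whose distributions differ between the two events."""
    t = band_t_statistics(spec_a, spec_b)
    return np.flatnonzero(np.abs(t) >= t_threshold)

# Toy data: 8 bands; bands 2 and 5 are shifted in event B, the rest are
# identically distributed noise and should be flagged as useless.
rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, size=(200, 8))
b = rng.normal(0.0, 1.0, size=(200, 8))
b[:, [2, 5]] += 1.5
kept = select_distinct_bands(a, b, t_threshold=4.0)
```

Dropping the complement of `kept` is what shrinks the classifier's input dimension in the paper's scheme.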

Language: English

Citations

1

Indian Cross Corpus Speech Emotion Recognition Using Multiple Spectral-Temporal-Voice Quality Acoustic Features and Deep Convolution Neural Network DOI Creative Commons
Rupali Kawade, Sonal K. Jagtap

Revue d intelligence artificielle, Journal Year: 2024, Volume and Issue: 38(3), P. 913 - 927

Published: June 21, 2024

Speech Emotion Recognition (SER) is crucial for enriching next-generation human-machine interaction (HMI) with emotional intelligence capabilities by extracting emotions from words and voice. However, current SER techniques are developed within experimental boundaries and face major challenges such as a lack of robustness across languages, cultures, age gaps, and the gender of speakers. Very little work has been carried out for Indian corpora, which have higher diversity, a large number of dialects, and vast changes due to regional and geographical aspects. India is one of the largest consumers of HMI systems, social networking sites, and internet services; it is therefore important that research focuses on Indian corpora. This paper presents cross-corpus SER (CCSER) using multiple acoustic features (MAF) and a deep convolution neural network (DCNN) to improve SER. The MAF consist of various spectral, temporal, and voice quality features. Further, a Fire Hawk based optimization (FHO) technique is utilized for salient feature selection. The FHO selects the important features to minimize computational complexity and maximize the distinctiveness and inter-class variance of the features. The DCNN provides better correlation, representation, and description of variation in timbre, intonation, and pitch, and superior connectivity of global and local speech signal characteristics to characterize the corpus. The outcomes of the suggested scheme are evaluated on the Indo-Aryan language family (Hindi and Urdu) and the Dravidian language family (Telugu and Kannada). The proposed scheme improves accuracy for multilingual SER and outperforms traditional techniques, achieving 58.83%, 61.75%, 69.75%, and 45.51% for Hindi, Urdu, Telugu, and Kannada training, respectively.
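The paper's Fire Hawk optimizer is a metaheuristic and is not reproduced here. As a simplified stand-in for "salient feature selection that maximizes inter-class variance", the sketch below ranks features by their Fisher score (between-class variance over within-class variance) in NumPy; all names and the toy data are assumptions for illustration.

```python
import numpy as np

def fisher_scores(X, y):
    """Fisher score per feature: between-class over within-class variance."""
    classes = np.unique(y)
    overall = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)  # epsilon guards against zero variance

def select_top_k(X, y, k):
    """Indices of the k most class-discriminative features."""
    return np.argsort(fisher_scores(X, y))[::-1][:k]

# Toy data: 6 acoustic features, but only features 0 and 3 carry
# class information (their means are shifted for class 1).
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 6))
y = rng.integers(0, 2, size=300)
X[y == 1, 0] += 2.0
X[y == 1, 3] += 2.0
top2 = select_top_k(X, y, k=2)
```

A metaheuristic like FHO searches over feature subsets rather than scoring features independently, but the objective it optimizes (discriminative, low-dimensional features) is the same one this score approximates.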

Language: English

Citations

1

Speech emotion recognition using the novel SwinEmoNet (Shifted Window Transformer Emotion Network) DOI

R. Ramesh, Viswanathan Balasubramanian Prahaladhan, P. Nithish

et al.

International Journal of Speech Technology, Journal Year: 2024, Volume and Issue: 27(3), P. 551 - 568

Published: July 10, 2024

Language: English

Citations

1

Machine Learning Based Heart Disease Prediction Using ECG Image DOI
Kishor Bhangale, Saurabh Kadam, Sakshi Chame

et al.

Published: May 24, 2024

Language: English

Citations

1