Analysis of Deep Learning Models for Voice Pathology Detection

Adham Ahmed Said, Ahmad Khaled Mohammed, Mohammad Essam, et al.

Published: Nov. 21, 2023

Voice disorders affect a significant portion of the global population, particularly those in vocally demanding professions such as singers, actors, teachers, and lawyers. Early detection and diagnosis of voice pathologies are critical to improving treatment outcomes and preventing further damage to the vocal cords. Digital processing of speech signals has emerged as a promising technique for analyzing vocal cord vibrations and identifying deformities in vocal cord function. In this paper, a cost-effective computational method is described that involves passing the voice signal through a stack of band-pass filters, dividing the output of each filter into a set of overlapped frames, applying the autocorrelation formula to every single frame, and using entropy to extract features. The method has shown promise in reliably detecting and classifying voice diseases, but further research is required to confirm its efficacy and reliability. Deep learning algorithms and Mel spectrogram feature extraction techniques are also presented in this paper for voice pathology detection. VGG16, VGG19, and ResNet50 are compared. The system demonstrated high prediction accuracy on the training and testing datasets. It shows potential for clinical applications in voice disorder assessment and diagnosis. It also holds promise as a telemedicine tool, enabling remote monitoring of patients' vocal health.
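The framing, autocorrelation, and entropy steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the band-pass filter stage is omitted, and the frame length, hop size, the unnormalized autocorrelation, and the Shannon-entropy definition are all assumptions chosen for the example.

```python
import math

def split_frames(signal, frame_len, hop):
    """Split a sequence into frames; hop < frame_len makes them overlap."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def autocorrelation(frame):
    """Unnormalized autocorrelation r[k] = sum_n x[n] * x[n + k], k = 0..N-1."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(n)]

def shannon_entropy(values):
    """Entropy in bits of |values| after normalizing them to sum to 1."""
    mags = [abs(v) for v in values]
    total = sum(mags)
    if total == 0:
        return 0.0
    probs = [m / total for m in mags if m > 0]
    return -sum(p * math.log2(p) for p in probs)

def entropy_features(signal, frame_len=64, hop=32):
    """One entropy value per overlapped frame (assumed feature definition)."""
    return [shannon_entropy(autocorrelation(frame))
            for frame in split_frames(signal, frame_len, hop)]

# Toy input: a synthetic sine wave standing in for one filter's output.
tone = [math.sin(2 * math.pi * 5 * t / 256) for t in range(256)]
features = entropy_features(tone)
```

In a full pipeline this would presumably be repeated for the output of each band-pass filter, and the per-band entropy values would be collected into one feature vector.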

Language: English

Enhancing Voice Disorder Detection Using Deep Transfer Learning Feature Fusion
Roohum Jegan, R. Jayagowri

Published: March 14, 2024

This paper introduces a computerized, non-invasive voice pathology detection system using deep transfer learning network (DTLN) feature fusion. The system takes both healthy and pathological voice samples as input and converts them into mel-spectrogram visual representations. Subsequently, it employs three architectures, namely (a) AlexNet, (b) ResNet-50, and (c) Inception-V3, to extract complex features from the signal's spectrograms. As the feature vector dimensions grow due to the aggregation of these CNN models, the study uses an infinite feature selection algorithm to identify the most distinguishing features. These selected optimal features are then used to classify the speech as either healthy or pathological, utilizing a K-nearest neighbor (KNN) classifier. The effectiveness of this method is evaluated on well-established datasets, AVPD, SVD, and PdA, using metrics such as precision, specificity, sensitivity, F-measure, and accuracy. The experimental results reveal that the proposed feature fusion approach achieves accuracy rates of 97.86%, 95%, and 96.83% for the AVPD, SVD, and PdA datasets, respectively.
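The fusion-then-KNN idea can be illustrated with a small sketch. It assumes fusion means simple concatenation of per-model feature vectors, uses tiny made-up vectors in place of real AlexNet, ResNet-50, and Inception-V3 embeddings, and skips the feature selection step entirely; the helper names `fuse` and `knn_predict` are hypothetical.

```python
import math
from collections import Counter

def fuse(*feature_vectors):
    """Concatenate per-model feature vectors into one fused vector."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean)."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy fused vectors: three hypothetical "models" contribute 2, 1, and 1 values.
train_x = [
    fuse([0.1, 0.2], [0.0], [0.1]),
    fuse([0.2, 0.1], [0.1], [0.0]),
    fuse([0.0, 0.1], [0.1], [0.2]),
    fuse([0.9, 0.8], [1.0], [0.9]),
    fuse([0.8, 0.9], [0.9], [1.0]),
    fuse([1.0, 0.9], [0.8], [0.9]),
]
train_y = ["healthy"] * 3 + ["pathological"] * 3

prediction = knn_predict(train_x, train_y, fuse([0.85, 0.9], [0.95], [0.9]))
```

With real embeddings the fused vector would have thousands of dimensions, which is why the abstract describes a selection step before classification.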

Language: English

Citations

1

Multi-label voice disorder classification using raw waveforms
Gökay Dişken

TURKISH JOURNAL OF ELECTRICAL ENGINEERING & COMPUTER SCIENCES, Journal Year: 2024, Volume and Issue: 32(4), P. 590 - 604

Published: July 26, 2024

Automated voice disorder systems that distinguish pathological voices from healthy ones have been developed with the aid of machine learning methods. Both clinicians and patients can benefit from these systems, as they provide many advantages compared to invasive techniques. These systems produce binary (healthy/pathological) or multi-class (healthy/selected pathologies) decisions. However, multiple disorders might exist in an individual's voice. Multi-label classification should be considered in such cases. To date, only a single report is available on this topic, where hand-crafted features were used and a data augmentation technique was utilized to overcome class imbalances. In this study, a similar experimental setup is followed to investigate the suitability of raw voice signals as inputs for multi-label classification. A deep model, which consists of residual blocks with a novel gating mechanism, is proposed. The gating mechanism weighs the channels of a residual block's output based on both its output and the previous layer's output. Using a SincNet filterbank that operates directly on the raw waveform as the initial layer, 0.99 accuracy and a 0.98 F1 score were observed with natural /a/ vowels from the Saarbruecken Voice Database, with time domain data augmentation used to balance the samples. On the other hand, reducing the number of augmented samples decreased the performance of the systems, indicating the need for a balanced dataset to avoid oversampling underrepresented classes. The proposed architecture performed consistently better than ResNet18 connected with attention, which verified the effectiveness of the gating mechanism.
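To make the multi-label setting concrete, the sketch below thresholds each label's score independently, so one recording can carry several disorders at once, and computes a micro-averaged F1 score. The 0.5 threshold, the three-label layout, the toy scores, and the choice of micro averaging are assumptions for illustration; the abstract does not state how its F1 score was averaged.

```python
def to_labels(scores, threshold=0.5):
    """Independent per-label decisions: several labels may be active at once."""
    return [1 if s >= threshold else 0 for s in scores]

def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives and false negatives
    over all samples and labels before computing the score."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy model outputs for three recordings and three hypothetical disorder labels.
scores = [[0.9, 0.7, 0.1],
          [0.2, 0.1, 0.05],
          [0.6, 0.3, 0.8]]
y_true = [[1, 1, 0],
          [0, 0, 0],
          [1, 1, 1]]

y_pred = [to_labels(row) for row in scores]
f1 = micro_f1(y_true, y_pred)
```

A multi-class system would instead pick exactly one label per recording (for example with an argmax), which cannot represent the first and third toy recordings above.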

Language: English

Citations

1

Pathological voice detection using optimized deep residual neural network and explainable artificial intelligence
Roohum Jegan, R. Jayagowri

Multimedia Tools and Applications, Journal Year: 2024, Volume and Issue: unknown

Published: Oct. 8, 2024

Language: English

Citations

1

Advances in Automated Voice Pathology Detection: A Comprehensive Review of Speech Signal Analysis Techniques
Anitha Sankaran, Lakshmi Sutha Kumar

IEEE Access, Journal Year: 2024, Volume and Issue: 12, P. 181127 - 181148

Published: Jan. 1, 2024

Language: English

Citations

1

Center-bridged Interaction Fusion for hyperspectral and LiDAR classification

Lu Huo, Jiahao Xia, Leijie Zhang, et al.

Neurocomputing, Journal Year: 2024, Volume and Issue: 590, P. 127757 - 127757

Published: April 25, 2024

Language: English

Citations

1

Patho VoiceAI: Classifying Pathology Types in Human Voices

Srinidhi Kanagachalam, Deok‐Hwan Kim

Published: July 2, 2024

Language: English

Citations

1

MBIAN: Multi-level bilateral interactive attention network for multi-modal image processing
Kai Sun, Jiangshe Zhang, Jialin Wang, et al.

Expert Systems with Applications, Journal Year: 2023, Volume and Issue: 231, P. 120733 - 120733

Published: June 10, 2023

Language: English

Citations

1

AROA based Pre-trained Model of Convolutional Neural Network for Voice Pathology Detection and Classification

J Manikandan, K. Kayalvizhi, Yuvaraj Nachimuthu, et al.

Journal of Machine and Computing, Journal Year: 2024, Volume and Issue: unknown, P. 463 - 471

Published: April 5, 2024

With the demand for better, more user-friendly HMIs, voice recognition systems have risen in prominence in recent years. The use of computer-assisted vocal pathology categorization tools allows for the accurate detection of voice diseases. By using these methods, vocal disorders may be diagnosed early on and treated accordingly. An effective Deep Learning-based tool for feature extraction-based vocal pathology identification is the goal of this project. This research presents the results of applying EfficientNet, a pre-trained Convolutional Neural Network (CNN), to a speech pathology dataset in order to achieve the highest possible classification accuracy. An Artificial Rabbit Optimization Algorithm (AROA)-tuned set of parameters complements the model's mobNet building elements, which include a linear stack of divisible convolution and max-pooling layers activated by Swish. In order to make the suggested approach applicable to a broad variety of voice disorder problems, this study also suggests a unique training method along with several training methodologies. One speech database, the Saarbrücken voice database (SVD), has been used to test the proposed technology. With up to 96% accuracy, the experimental findings demonstrate that the suggested CNN approach is capable of detecting voice pathologies. The suggested method demonstrates great potential for use in real-world clinical settings, where it can provide accurate classifications in as little as three seconds and expedite automated diagnosis and treatment.
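For readers unfamiliar with the Swish activation mentioned in the abstract, a minimal sketch follows. It uses the common definition swish(x) = x * sigmoid(x) with a fixed slope of 1; whether the paper uses this exact variant is an assumption.

```python
import math

def swish(x):
    """Swish activation x * sigmoid(x), written to avoid overflow in exp()."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe: x is negative, so exp(x) is at most 1
    return x * e / (1.0 + e)

# Unlike ReLU, Swish is smooth and lets small negative values through.
activations = [swish(v) for v in (-2.0, -0.5, 0.0, 0.5, 2.0)]
```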

Language: English

Citations

0

Voice Pathology Detection Based on Canonical Correlation Analysis Method Using Hilbert–Huang Transform and LSTM Features
Mehmet Bilal Er, Nagehan İlhan

Arabian Journal for Science and Engineering, Journal Year: 2024, Volume and Issue: unknown

Published: Sept. 26, 2024

Language: English

Citations

0

Optimized early fusion of handcrafted and deep learning descriptors for voice pathology detection and classification
Roohum Jegan, R. Jayagowri

Healthcare Analytics, Journal Year: 2024, Volume and Issue: unknown, P. 100369 - 100369

Published: Nov. 1, 2024

Language: English

Citations

0