Published: Jan. 1, 2024
Language: English
Published: Jan. 18, 2024
Globally, valvular heart diseases (VHDs) account for a major portion of deaths and illnesses. Accurate and timely identification of VHDs is essential for directing proper treatment and enhancing patient outcomes. Phonocardiogram (PCG) signals provide a non-invasive and affordable means of capturing acoustic information about the cardiac cycle, rendering them suitable for VHD detection. The proposed method provides an explainable artificial intelligence (XAI) framework for PCG-based VHD diagnosis using a convolutional neural network (CNN) - long short-term memory (LSTM) (CNN-LSTM) network. The method leverages the strengths of deep learning to achieve high diagnostic accuracy while providing interpretability of the model's predictions through XAI. Data augmentation techniques are utilized to augment the PCG signals. Mel-spectrograms are used to extract relevant features from the PCG signals. The model consists of a CNN architecture and a layer of LSTM, making a CNN-LSTM architecture. The model will be a 5-class classifier with classes named aortic stenosis, mitral stenosis, mitral regurgitation, mitral valve prolapse, and normal. The XAI technique employed is gradient-weighted class activation mapping (Grad-CAM), enabling visualization of the model's decision-making by generating heatmaps. An impressive accuracy of 97.5% has been achieved by the model. The integration of XAI ensures comprehensive interpretation of the model, transparency, and potential for real-time clinical deployment.
Language: English
Citations: 4
Speech Communication, Journal Year: 2024, Volume and Issue: 159, P. 103069 - 103069
Published: April 1, 2024
Language: English
Citations: 4
EURASIP Journal on Audio Speech and Music Processing, Journal Year: 2024, Volume and Issue: 2024(1)
Published: June 25, 2024
Dysarthria is a speech disorder that affects the ability to communicate due to articulation difficulties. This research proposes a novel method for automatic dysarthria detection (ADD) and automatic dysarthria severity level assessment (ADSLA) by using a variable continuous wavelet transform (CWT) layered convolutional neural network (CNN) model. To determine their efficiency, the proposed model is assessed using two distinct corpora, TORGO and UA-Speech, comprising both dysarthria patients and healthy subject speech signals. The study explores the effectiveness of CWT-layered CNN models that employ different wavelets such as Amor, Morse, and Bump. The study aims to analyze the models' performance without the need for feature extraction, which could provide deeper insights into the effectiveness of the models in processing complex data. Also, raw waveform modeling preserves the original signal's integrity and nuance, making it ideal for applications like speech recognition, signal processing, and image processing. Extensive analysis and experimentation have revealed that the Amor wavelet surpasses the Morse and Bump wavelets in accurately representing signal characteristics. The Amor wavelet outperforms the others in terms of signal reconstruction fidelity, noise suppression capabilities, and feature extraction accuracy. The study emphasizes the importance of selecting the appropriate wavelet for signal-processing tasks. The Amor wavelet is a reliable and precise choice for such applications. The UA-Speech dataset is crucial for more accurate dysarthria classification. Advanced deep learning techniques can simplify early intervention measures and expedite the diagnosis process.
Language: English
Citations: 4
Journal of Cybersecurity and Privacy, Journal Year: 2025, Volume and Issue: 5(1), P. 6 - 6
Published: Feb. 8, 2025
Advances in deep learning have led to dramatic improvements in generative synthetic speech, eliminating robotic speech patterns to create speech that is indistinguishable from a human voice. Although these advances are extremely useful in various applications, they also facilitate powerful attacks against both humans and machines. Recently, a new type of speech attack called partial fake (PF) speech has emerged. This paper studies how well machines, including speaker recognition systems and existing fake-speech detection tools, can distinguish between a human voice and computer-generated speech. Our study shows that machines can be easily deceived by PF speech and that the current defences are insufficient. These findings emphasise the urgency of increasing awareness and of creating automated defences.
Language: English
Citations: 0
Engineering Applications of Artificial Intelligence, Journal Year: 2025, Volume and Issue: 147, P. 110314 - 110314
Published: Feb. 22, 2025
Language: English
Citations: 0
Multimedia Tools and Applications, Journal Year: 2025, Volume and Issue: unknown
Published: March 13, 2025
Language: English
Citations: 0
Circuits Systems and Signal Processing, Journal Year: 2024, Volume and Issue: 43(5), P. 3261 - 3278
Published: Feb. 22, 2024
Language: English
Citations: 3
International Journal of Speech Technology, Journal Year: 2024, Volume and Issue: 27(3), P. 701 - 716
Published: July 26, 2024
Language: English
Citations: 0
Published: Aug. 2, 2024
Language: English
Citations: 0
Advances in information security, privacy, and ethics book series, Journal Year: 2024, Volume and Issue: unknown, P. 107 - 138
Published: July 12, 2024
Spoofing attacks are a major risk for automatic speaker verification systems, which are becoming more widespread. Adequate countermeasures are necessary since attacks like replay, synthetic, and deepfake attacks are difficult to identify. Technologies that can identify audio-level attacks must be developed in order to address this issue. In this chapter, the authors have proposed a combination of different spectrogram-based techniques with Residual Networks34 (ResNet34) for securing automatic speaker verification (ASV) systems. The methodology uses the Mel frequency scale-based Mel-spectrogram (MS), gamma frequency scale-based gammatone spectrogram (GS), Mel filter bank-based Mel cepstral spectrograms (MCS), acoustic pattern-based acoustic pattern spectrogram (APS), gammatone filter bank-based gammatone cepstral spectrogram (GCS), and short-time Fourier transform-based short Fourier spectrogram (SFS) methods, one by one, at the front end of the audio spoof detection system. These spectrograms are individually fed to the ResNet34 for classification at the backend.
Language: English
Citations: 0
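
Several of the abstracts listed above rely on a Mel-spectrogram front end. As a rough illustration only, the sketch below shows one common (HTK-style) definition of the Mel frequency scale and how filter centre frequencies can be spaced evenly on it. The abstracts do not say which Mel variant, filter count, or frequency range the authors used, so the function names and the values in the usage note are assumptions made for this example.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to Mels (HTK-style formula, an assumed variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert Mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> list[float]:
    """Return n_filters centre frequencies (Hz) spaced evenly on the Mel scale.

    The two band edges f_min and f_max are excluded, which is why the
    Mel range is split into n_filters + 1 equal steps.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]
```

Calling `mel_center_frequencies(10, 0.0, 8000.0)` (hypothetical settings) yields centres that are packed closely at low frequencies and spread out at high frequencies, which is the property that makes Mel-scaled spectrograms a popular input for audio classifiers.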