Published: Jan. 1, 2024
Language: English
Published: Jan. 1, 2024
Language: English
Circuits Systems and Signal Processing, Year: 2024, Vol. 43(5), pp. 3261-3278
Published: Feb. 22, 2024
Language: English
Cited by: 4
Published: Jan. 18, 2024
Globally, valvular heart diseases (VHDs) account for a major portion of deaths and illnesses. Accurate and timely identification of VHDs is essential for directing proper treatment and enhancing patient outcomes. Phonocardiogram (PCG) signals provide a non-invasive and affordable means of capturing acoustic information about the cardiac cycle, rendering them suitable for VHD detection. The proposed method provides an explainable artificial intelligence (XAI) framework for PCG-based VHD diagnosis using a convolutional neural network (CNN) and long short-term memory (LSTM) network (CNN-LSTM). The method leverages the strengths of deep learning to achieve high diagnostic accuracy while providing interpretability of the model's predictions through XAI. Data augmentation techniques are utilized to augment the PCG signals. Mel-spectrograms are used to extract relevant features from the PCG signals. The model consists of a CNN architecture and an LSTM layer, making a CNN-LSTM architecture. The model is a 5-class classifier with classes including aortic stenosis, mitral regurgitation, valve prolapse, and normal. The XAI technique employed is gradient-weighted class activation mapping (Grad-CAM), enabling visualization of the model's decision-making by generating heatmaps. An impressive accuracy of 97.5% has been achieved by the model. The integration of XAI ensures a comprehensive interpretation of the model, transparency, and potential for real-time clinical deployment.
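The Mel-spectrogram front end mentioned in this abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' pipeline: the sampling rate, frame length, hop size, number of Mel bands, the synthetic PCG-like test signal, and all function names are assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style Hz -> Mel mapping (one of several conventions).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters with centres evenly spaced on the Mel scale
    # between 0 Hz and the Nyquist frequency.
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(x, sr, n_fft=256, hop=128, n_mels=32):
    # Frame the signal, apply a Hann window, take the power spectrum,
    # project onto the Mel filterbank, and convert to decibels.
    n_frames = 1 + (len(x) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return 10.0 * np.log10(mel + 1e-10).T  # shape: (n_mels, n_frames)

# Hypothetical stand-in for a PCG recording: short 50 Hz bursts at ~1.2 Hz.
sr = 2000
t = np.arange(2 * sr) / sr
pcg_like = np.sin(2 * np.pi * 50 * t) * (np.sin(2 * np.pi * 1.2 * t) > 0.9)
S = log_mel_spectrogram(pcg_like, sr)
```

The resulting 2-D array `S` is the kind of time-frequency image that a CNN-LSTM classifier could take as input; a real pipeline would more likely rely on an established audio library and on parameters tuned for heart sounds.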
Language: English
Cited by: 4
Speech Communication, Year: 2024, Vol. 159, p. 103069
Published: April 1, 2024
Language: English
Cited by: 4
EURASIP Journal on Audio Speech and Music Processing, Year: 2024, Vol. 2024(1)
Published: June 25, 2024
Abstract: Dysarthria is a speech disorder that affects the ability to communicate due to articulation difficulties. This research proposes a novel method for automatic dysarthria detection (ADD) and severity level assessment (ADSLA) using a variable continuous wavelet transform (CWT) layered convolutional neural network (CNN) model. To determine its efficiency, the proposed model is assessed on two distinct corpora, TORGO and UA-Speech, comprising speech signals from both dysarthric patients and healthy subjects. The study explores the effectiveness of CWT-layered CNN models that employ different wavelets, such as Amor, Morse, and Bump. It aims to analyze the models' performance without the need for feature extraction, which could provide deeper insights into their effectiveness in processing complex data. Also, raw waveform modeling preserves the original signal's integrity and nuance, making it ideal for applications like speech recognition, signal processing, and image processing. Extensive analysis and experimentation have revealed that the Amor wavelet surpasses the Morse and Bump wavelets in accurately representing signal characteristics. It outperforms the others in terms of signal reconstruction fidelity, noise suppression capabilities, and feature extraction accuracy. The study emphasizes the importance of selecting the appropriate wavelet for signal-processing tasks. The Amor wavelet is a reliable and precise choice for such applications. The UA-Speech dataset is crucial for more accurate dysarthria classification. Advanced deep learning techniques can simplify early intervention measures and expedite the diagnosis process.
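To make the CWT step concrete, here is a minimal NumPy sketch of a continuous wavelet transform with an analytic Morlet wavelet (the family that MATLAB labels "amor"). It is an illustration under stated assumptions, not the paper's CWT layer: the function name, the centre frequency `w0 = 6`, the peak-gain normalization, and the test tone are all choices made for this example.

```python
import numpy as np

def cwt_analytic_morlet(x, sr, freqs, w0=6.0):
    # FFT-based CWT: multiply the signal spectrum by the wavelet's
    # frequency response at each scale, then transform back.
    n = len(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=1.0 / sr)  # rad/s
    X = np.fft.fft(x)
    out = np.empty((len(freqs), n), dtype=complex)
    for k, f in enumerate(freqs):
        scale = w0 / (2.0 * np.pi * f)  # scale whose response peaks at f Hz
        # Analytic wavelet: Gaussian bump on positive frequencies only.
        # The factor 2 is an assumed normalization so that a unit-amplitude
        # sinusoid at the peak frequency yields magnitude close to 1.
        psi_hat = 2.0 * np.exp(-0.5 * (scale * omega - w0) ** 2) * (omega > 0)
        out[k] = np.fft.ifft(X * psi_hat)
    return out

# Hypothetical test signal: a 100 Hz tone, one second at 1 kHz sampling.
sr = 1000
t = np.arange(sr) / sr
tone = np.cos(2 * np.pi * 100 * t)
freqs = np.array([25.0, 50.0, 100.0, 200.0])
scalogram = np.abs(cwt_analytic_morlet(tone, sr, freqs))
strongest_row = int(np.argmax(scalogram.mean(axis=1)))
```

Here `scalogram` is a (frequencies x time) magnitude map, and `strongest_row` indexes the 100 Hz row. A model of the kind described would consume such maps computed from speech rather than from a test tone.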
Language: English
Cited by: 4
Journal of Cybersecurity and Privacy, Year: 2025, Vol. 5(1), p. 6
Published: Feb. 8, 2025
Advances in deep learning have led to dramatic improvements in generative synthetic speech, eliminating robotic speech patterns to create speech that is indistinguishable from a human voice. Although these advances are extremely useful in various applications, they also facilitate powerful attacks against both humans and machines. Recently, a new type of attack called partial fake (PF) speech has emerged. This paper studies how well machines, including speaker recognition systems and existing fake-speech detection tools, can distinguish between a human voice and computer-generated speech. Our study shows that machines can be easily deceived by PF speech and that the current defences are insufficient. These findings emphasise the urgency of increasing awareness and of creating automated
Language: English
Cited by: 0
Engineering Applications of Artificial Intelligence, Year: 2025, Vol. 147, p. 110314
Published: Feb. 22, 2025
Language: English
Cited by: 0
Multimedia Tools and Applications, Year: 2025, Vol. unknown
Published: March 13, 2025
Language: English
Cited by: 0
Communications in computer and information science, Year: 2025, Vol. unknown, pp. 165-178
Published: Jan. 1, 2025
Language: English
Cited by: 0
Published: Aug. 2, 2024
Language: English
Cited by: 0
Advances in information security, privacy, and ethics book series, Year: 2024, Vol. unknown, pp. 107-138
Published: July 12, 2024
Spoofing attacks are a major risk for automatic speaker verification systems, which are becoming more widespread. Adequate countermeasures are necessary since attacks like replay, synthetic, and deepfake attacks are difficult to identify. Technologies that can identify audio-level attacks must be developed in order to address this issue. In this chapter, the authors have proposed a combination of different spectrogram-based techniques with Residual Networks34 (ResNet34) for securing automatic speaker verification (ASV) systems. The methodology uses the Mel frequency scale-based Mel-spectrogram (MS), gammatone spectrogram (GS), Mel filter bank-based cepstral spectrograms (MCS), acoustic pattern-based acoustic pattern spectrogram (APS), gammatone cepstral spectrogram (GCS), and short-time Fourier transform-based short-time Fourier spectrogram (SFS) methods, one by one, at the front end of the audio spoof detection system. These are individually fed to ResNet34 for classification at the backend.
Language: English
Cited by: 0