Diagnosis of pathological speech with streamlined features for long short-term memory learning
Tuan D. Pham, Simon Holmes, Lifong Zou, et al.
Computers in Biology and Medicine, Journal Year: 2024, Volume and Issue: 170, P. 107976 - 107976
Published: Jan. 8, 2024
Pathological speech diagnosis is crucial for identifying and treating various speech disorders. Accurate diagnosis aids in developing targeted intervention strategies, improving patients' communication abilities, and enhancing their overall quality of life. With the rising incidence of speech-related conditions globally, including oral health, the need for efficient and reliable diagnostic tools has become paramount, emphasizing the significance of advanced research in this field.
Language: English
Bio-inspired optimization of feature selection and SVM tuning for voice disorders detection
Knowledge-Based Systems, Journal Year: 2025, Volume and Issue: unknown, P. 112950 - 112950
Published: Jan. 1, 2025
Language: English
Pathological Voice Classification Using MEEL Features and SVM-Tabnet Model
Speech Communication, Journal Year: 2024, Volume and Issue: 162, P. 103100 - 103100
Published: July 1, 2024
Language: English
Smartphone-derived multidomain features including voice, finger-tapping movement and gait aid early identification of Parkinson’s disease
npj Parkinson's Disease, Journal Year: 2025, Volume and Issue: 11(1)
Published: May 5, 2025
Language: English
Automatic cross‐ and multi‐lingual recognition of dysphonia by ensemble classification using deep speaker embedding models
Expert Systems, Journal Year: 2024, Volume and Issue: 41(10)
Published: June 12, 2024
Abstract
Machine Learning (ML) algorithms have demonstrated remarkable performance in dysphonia detection using speech samples. However, their efficacy often diminishes when tested on languages different from the training data, raising questions about their suitability in clinical settings. This study aims to develop a robust method for cross- and multi-lingual dysphonia detection that overcomes the limitation of language dependency in existing ML methods. We propose an innovative approach that leverages embeddings from speaker verification models, especially ECAPA and x-vector, and employs a majority voting ensemble classifier. We utilize features extracted from these embeddings to train three distinct classifiers. The significant advantage of these embedding models lies in their capability to capture speaker characteristics in a language-independent manner, forming fixed-dimensional feature spaces. Additionally, we investigate the impact of generating synthetic data within the embedding feature space using the Synthetic Minority Oversampling Technique (SMOTE). Our experimental results unveil the effectiveness of the proposed method in dysphonia detection. Compared to the results obtained with x-vector embeddings, ECAPA consistently demonstrates superior performance in distinguishing between healthy and dysphonic speech, achieving accuracy values of 93.33% and 96.55% in the cross-lingual and multi-lingual scenarios, respectively. This highlights the capabilities of speaker verification models, especially ECAPA, in capturing language-independent features that enhance overall detection performance. The proposed method effectively addresses the challenges of language dependency in dysphonia detection. ECAPA embeddings, combined with majority voting ensemble classifiers, show potential for improving the accuracy and reliability of dysphonia detection in cross- and multi-lingual scenarios.
Language: English
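The majority voting step named in the abstract above can be illustrated with a minimal sketch. The abstract does not say which three classifiers are used or how their outputs are combined in detail, so the function name `majority_vote` and the label encoding (0 = healthy, 1 = dysphonic) are assumptions made only for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` is a list with one entry per classifier; each entry is a
    list of predicted labels, one per sample. This is a hypothetical helper,
    not the authors' implementation.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every classifier must predict the same number of samples")

    voted = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Three hypothetical classifiers, three samples (0 = healthy, 1 = dysphonic)
clf_a = [0, 1, 1]
clf_b = [0, 0, 1]
clf_c = [1, 1, 1]
combined = majority_vote([clf_a, clf_b, clf_c])
```

With an odd number of classifiers and two classes, a tie cannot occur, which may be one reason three classifiers is a convenient choice for a hard-voting ensemble.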
Beyond breathalyzers: AI-powered speech analysis for alcohol intoxication detection
Expert Systems with Applications, Journal Year: 2024, Volume and Issue: 262, P. 125656 - 125656
Published: Nov. 6, 2024
Language: English
Reverb and Noise as Real-World Effects in Speech Recognition Models: A Study and a Proposal of a Feature Set
Applied Sciences, Journal Year: 2024, Volume and Issue: 14(23), P. 11446 - 11446
Published: Dec. 9, 2024
Reverberation and background noise are common and unavoidable real-world phenomena that hinder automatic speaker recognition systems, particularly because these systems are typically trained on noise-free data. Most models rely on fixed audio feature sets. To evaluate the dependency of features on reverberation and noise, this study proposes augmenting the commonly used mel-frequency cepstral coefficients (MFCCs) with relative spectral (RASTA) features. The performance of these features was assessed using noisy data generated by applying reverberation and pink noise to the DEMoS dataset, which includes 56 speakers. Verification models were trained on clean data using MFCCs, RASTA features, or their combination as inputs. They were validated on augmented data with progressively increasing noise and reverberation levels. The results indicate that MFCCs struggle to identify the main speaker, while the RASTA method has difficulty with the opposite class. The hybrid feature set, derived from their combination, demonstrates the best overall performance as a compromise between the two. Although the MFCC method is the standard and performs well on training data, it shows a significant tendency to misclassify the main speaker in real-world scenarios, a critical limitation for modern user-centric verification applications. The hybrid feature set, therefore, proves effective as a balanced solution, optimizing both sensitivity and specificity.
Language: English
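The noise augmentation described in the abstract above can be sketched as follows. The abstract does not state how the pink noise was generated or how its level was controlled, so the spectral-shaping generator, the signal-to-noise ratio (SNR) parameterization, and the names `pink_noise` and `add_noise_at_snr` are all assumptions for illustration, not the authors' procedure.

```python
import numpy as np


def pink_noise(n_samples, rng):
    """Approximate pink (1/f power) noise by shaping white noise in the frequency domain."""
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples)
    freqs[0] = freqs[1]  # avoid division by zero at the DC bin
    spectrum /= np.sqrt(freqs)  # amplitude ~ 1/sqrt(f) gives power ~ 1/f
    noise = np.fft.irfft(spectrum, n=n_samples)
    return noise / np.std(noise)


def add_noise_at_snr(signal, noise, snr_db):
    """Scale `noise` so the mixture has the requested SNR in dB, then add it."""
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return signal + scale * noise


# Hypothetical stand-in for a clean utterance: one second of a 220 Hz tone at 16 kHz
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 220 * t)
noise = pink_noise(clean.size, rng)

# Progressively stronger noise corresponds to progressively lower SNR values
noisy_versions = {snr: add_noise_at_snr(clean, noise, snr) for snr in (20.0, 10.0, 0.0)}
```

Expressing the noise level as an SNR keeps the augmentation comparable across utterances of different loudness, which is why it is used here in place of a fixed noise amplitude.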
Optimized Hybrid Model for Enhanced Parkinson’s Disease Classification Using Feature Fused Voice Signal
S. Sharanyaa, M Sambath
International Journal of Electronics and Communication Engineering, Journal Year: 2023, Volume and Issue: 10(11), P. 11 - 26
Published: Nov. 30, 2023
Parkinson’s Disease (PD) is a common neurological disorder that leads to reduced nerve function in the brain as a result of decreased dopamine generation. The disease is progressive, and patients may have difficulty speaking, resulting in speech variations. Hence, it is essential to detect the disease at an early stage; through proper diagnosis, its effect can be controlled. This work aims to classify PD based on a vocal feature set using a hybrid CNN-ALSTM model. The model is trained with Spectral, Acoustic, and Mel-Spectrogram features obtained from de-noised voice signals. The proposed work involves four phases. In the first phase, voice signals are extracted from the input data, and de-noising is done using Improved Optimized Variational Mode Decomposition (IO-VMD). In the second phase, Mel-Spectrograms are generated from the pre-processed signals, where deep features are extracted using Custom CNN, EfficientNetB0, and Inceptionv3 models. In the third phase, the metaheuristic Squirrel Search Water Cycle Algorithm (SSWA) is applied to the feature vectors, and SSWA is used for feature selection and hyperparameter tuning. Finally, the spectral and acoustic features are concatenated with the mel spectrogram features, trained, and classified using the Attention Long Short Term Memory (ALSTM) model. A comparative analysis of hybrid models like CNN-ALSTM, Inceptionv3-ALSTM, and EfficientNetB0-ALSTM is carried out to classify PD. From the analysis, the proposed CNN-ALSTM algorithm achieves an accuracy of 96.8% and performs better than the other models.
Language: English
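The final fusion step in the abstract above, where spectral and acoustic features are concatenated with mel-spectrogram features before classification, can be sketched as a simple per-sample concatenation. The feature dimensions, the random placeholder data, and the name `fuse_features` are assumptions for illustration; the abstract does not give the actual feature sizes or fusion details.

```python
import numpy as np


def fuse_features(spectral, acoustic, mel_embedding):
    """Concatenate three per-sample feature blocks along the feature axis.

    Each input is a 2-D array of shape (n_samples, n_features_i). This is a
    hypothetical helper, not the authors' implementation.
    """
    blocks = [np.asarray(b, dtype=float) for b in (spectral, acoustic, mel_embedding)]
    n_samples = blocks[0].shape[0]
    if any(b.ndim != 2 or b.shape[0] != n_samples for b in blocks):
        raise ValueError("all feature blocks must be 2-D with the same number of samples")
    return np.concatenate(blocks, axis=1)


# Placeholder data: 4 recordings with assumed feature sizes 3, 2 and 5
rng = np.random.default_rng(0)
spectral = rng.standard_normal((4, 3))
acoustic = rng.standard_normal((4, 2))
mel_embedding = rng.standard_normal((4, 5))  # stands in for CNN-derived deep features

fused = fuse_features(spectral, acoustic, mel_embedding)
```

The fused matrix keeps one row per recording, so it can be passed to any downstream classifier that accepts fixed-length feature vectors.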