Voice disorders affect a significant portion of the global population, particularly those in vocally demanding professions such as singers, actors, teachers, and lawyers. Early detection and diagnosis of voice pathology diseases are critical to improving treatment outcomes and preventing further damage to the vocal cords. Digital processing of speech signals has emerged as a promising technique for analyzing vocal cord vibrations and identifying deformities in vocal cord function. In this paper, a cost-effective computational method involves processing the speech signal by passing it through a stack of band-pass filters, dividing the processed signal of each filter into a set of overlapped frames, applying the autocorrelation formula to every single frame, and using entropy to extract features. The method has shown promise in reliably detecting and classifying voice diseases, but further research is required to confirm its efficacy and reliability. Deep learning algorithms and Mel spectrogram feature extraction techniques are presented in this paper for voice pathology detection. VGG16, VGG19, and ResNet50 are compared. The system demonstrated high prediction accuracy on the training and testing dataset. The method shows potential for clinical applications in voice disorder assessment and diagnosis. It also holds promise as a telemedicine tool, enabling remote monitoring of patients' vocal health.
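The framing, autocorrelation, and entropy steps named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the band-pass filter stack is omitted (the input is treated as the output of a single band), and the frame length, hop size, entropy definition (Shannon entropy of the normalized autocorrelation magnitudes), and all function names are assumptions made for the example.

```python
import math


def frame_signal(signal, frame_len, hop):
    """Split a sequence into overlapping frames (overlap = frame_len - hop)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def autocorrelation(frame):
    """Raw (unnormalized) autocorrelation r[k] = sum_n x[n] * x[n + k]."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(n)]


def shannon_entropy(values):
    """Entropy in bits of the distribution given by normalizing |values|."""
    mags = [abs(v) for v in values]
    total = sum(mags)
    if total == 0:
        return 0.0  # an all-zero frame carries no information
    probs = [m / total for m in mags if m > 0]
    return -sum(p * math.log2(p) for p in probs)


def entropy_features(signal, frame_len=64, hop=32):
    """One entropy value per overlapping frame (assumed feature layout)."""
    return [shannon_entropy(autocorrelation(frame))
            for frame in frame_signal(signal, frame_len, hop)]


# Toy input: a 200 Hz sine sampled at 8 kHz, standing in for one band's output.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(512)]
features = entropy_features(tone)
print(len(features), "frames")
```

In a full pipeline, the same function would presumably be applied to the output of each band-pass filter and the per-band feature vectors concatenated before classification.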
Computers, Materials & Continua,
Journal Year:
2024,
Volume and Issue:
80(1), P. 1 - 35
Published: Jan. 1, 2024
Multi-modal fusion technology has gradually become a fundamental task in many fields, such as autonomous driving, smart healthcare, sentiment analysis, and human-computer interaction. It is rapidly becoming the dominant research direction due to its powerful perception and judgment capabilities. Under complex scenes, multi-modal fusion technology utilizes the complementary characteristics of multiple data streams to fuse different data types and achieve more accurate predictions. However, achieving outstanding performance is challenging because of equipment limitations, missing information, and data noise. This paper comprehensively reviews existing methods based on multi-modal fusion techniques and completes a detailed and in-depth analysis. According to the data fusion stage, multi-modal fusion has four primary methods: early fusion, deep fusion, late fusion, and hybrid fusion. The paper surveys the three major multi-modal fusion technologies that can significantly enhance the effect of data fusion and further explores the applications of multi-modal fusion technology in various fields. Finally, it discusses the challenges and explores potential research opportunities. Multi-modal tasks still need intensive study because of data heterogeneity and quality. Preserving complementary information and eliminating redundant information between modalities is critical in multi-modal technology. Invalid data fusion methods may introduce extra noise and lead to worse results. This paper provides a comprehensive and detailed summary in response to these challenges.
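To make the distinction between fusion stages concrete, the sketch below contrasts two of the four methods listed in the abstract above, early fusion and late fusion, on toy feature vectors. It is an illustration under stated assumptions rather than anything taken from the review: the logistic scoring model, the weights, the `alpha` mixing parameter, and the two modality names are all hypothetical. Deep and hybrid fusion are not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear_score(features, weights, bias=0.0):
    """Probability from a logistic model (hypothetical stand-in for any classifier)."""
    return sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)


def early_fusion(audio_feat, image_feat, joint_weights):
    """Early fusion: concatenate modality features, then apply one joint model."""
    fused = list(audio_feat) + list(image_feat)
    return linear_score(fused, joint_weights)


def late_fusion(audio_feat, image_feat, audio_weights, image_weights, alpha=0.5):
    """Late fusion: score each modality separately, then combine the decisions."""
    p_audio = linear_score(audio_feat, audio_weights)
    p_image = linear_score(image_feat, image_weights)
    return alpha * p_audio + (1.0 - alpha) * p_image


# Toy inputs; every value here is made up for illustration.
audio = [0.2, -0.1]
image = [0.5, 0.3, -0.4]
p_early = early_fusion(audio, image, joint_weights=[1.0, 0.5, 0.8, -0.2, 0.3])
p_late = late_fusion(audio, image, [1.0, 0.5], [0.8, -0.2, 0.3], alpha=0.6)
print(p_early, p_late)
```

In this sketch, early fusion lets one model see cross-modal feature interactions, while late fusion keeps the modalities independent until the decision level, so a missing or noisy modality can be down-weighted through `alpha`.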
Computer Methods in Biomechanics & Biomedical Engineering,
Journal Year:
2023,
Volume and Issue:
27(14), P. 2041 - 2057
Published: Oct. 18, 2023
Abstract: This article proposes a noninvasive computer-aided assessment approach based on an optimized convolutional neural network for healthy and pathological voice detection. First, the input voice samples are transformed into mel-spectrogram time-frequency visual representations and fed into the CNN for training the model. The mel-spectrogram image captures the inherent speech variations beneficial for voice sample classification. The weights and biases of the trained CNN are further optimized using the artificial bee colony (ABC) optimization algorithm, resulting in an optimum model employed for testing on unseen data. The proposed approach is evaluated on three popular publicly available datasets: SVD, AVPD and VOICED. Experimental results emphasize that the ABC-optimized CNN model shows improved accuracy performance by 1.02% compared to the conventional CNN, illustrating its data-independent discriminative representation ability. Finally, gradient-weighted class activation mapping (Grad-CAM) explainable artificial intelligence (XAI) is utilized to make the decision understandable.
Keywords: Voice pathology detection; optimized CNN; explainable artificial intelligence; mel-spectrogram; image texture features
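As background for the mel-spectrogram representation mentioned in the abstract above, the sketch below shows how frequencies map onto the mel scale and how band centre frequencies are spaced. The abstract does not say which mel formula or how many bands the authors used; the 2595/700 formula is one common convention, and the band count and frequency range here are assumptions chosen for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using the common 2595 * log10(1 + f / 700) convention."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels, f_min, f_max):
    """Centre frequencies (Hz) of n_mels bands equally spaced on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]


# Assumed settings: 8 bands between 0 Hz and 4000 Hz.
centers = mel_center_frequencies(8, 0.0, 4000.0)
for c in centers:
    print(round(c, 1))
```

The bands are equally spaced in mel but increasingly far apart in Hz, which is why a mel-spectrogram gives finer resolution at low frequencies than at high ones.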
Acknowledgment: We would like to thank the authors of SVD, Barry WJ (Barry 2007), AVPD (Mesallam et al. 2017) and VOICED (Cesari et al. 2018) for providing the database. Also, we thank the anonymous reviewers for their valuable comments/suggestions.
Disclosure statement: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Funding: The author(s) reported there is no funding associated with the work featured in this article.