Detecting Obfuscated Malware Infections on Windows Using Ensemble Learning Techniques
Informatics and Automation, Journal Year: 2025, Volume and Issue: 24(1), P. 99-124. Published: Jan. 20, 2025
In the era of the internet and smart devices, malware detection has become crucial for system security. Obfuscated malware poses significant risks to various platforms, including computers, mobile devices, and IoT devices, by evading advanced security solutions. Traditional heuristic-based and signature-based methods often fail against these threats. Therefore, a cost-effective detection system was proposed using memory dump analysis and ensemble learning techniques. Utilizing the CIC-MalMem-2022 dataset, the effectiveness of decision trees, gradient-boosted trees, logistic regression, random forest, and LightGBM in identifying obfuscated malware was evaluated. The study demonstrated the superiority of ensemble learning techniques in enhancing detection accuracy and robustness. Additionally, SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) were employed to elucidate model predictions, improving transparency and trustworthiness. The analysis revealed vital features significantly impacting malware detection, such as process services, active file handles, registry keys, and callback functions. These insights are crucial for refining detection strategies and enhancing model performance. The findings contribute to cybersecurity efforts by comprehensively assessing machine learning algorithms for obfuscated malware detection through memory analysis. This paper offers valuable insights for future research and advancements in malware detection, paving the way for more robust and effective cybersecurity solutions in the face of evolving and sophisticated malware threats.
Language: English
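The abstract above credits ensemble learning with better accuracy and robustness than single models. As a rough illustration of the general idea only, the sketch below combines the labels of several classifiers by hard (majority) voting. This is a deliberately simple stand-in, not the paper's method: the paper evaluates random forest, gradient-boosted trees, and LightGBM, which build their ensembles through bagging and boosting. The function name `majority_vote` and the three prediction lists are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by hard voting."""
    if not predictions_per_model:
        raise ValueError("need at least one model")
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    # zip(*...) regroups the data so each tuple holds every model's vote
    # for a single sample.
    for sample_votes in zip(*predictions_per_model):
        label, _count = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of three base classifiers on five memory-dump samples
# (illustrative values, not results from the paper).
tree_preds = ["malware", "benign", "malware", "benign", "benign"]
logreg_preds = ["malware", "malware", "benign", "benign", "benign"]
forest_preds = ["benign", "malware", "malware", "benign", "malware"]

ensemble_preds = majority_vote([tree_preds, logreg_preds, forest_preds])
print(ensemble_preds)
```

With an odd number of voters and two classes, no ties can occur, and each individual model's mistake on a sample is outvoted as long as the other two agree.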
Machine learning models and dimensionality reduction for improving the Android malware detection
PeerJ Computer Science, Journal Year: 2024, Volume and Issue: 10, P. e2616. Published: Dec. 23, 2024
Today, a great number of attack opportunities for cybercriminals arise in Android, since it is one of the most used operating systems for many mobile applications. Hence, it is very important to anticipate these situations. To minimize this problem, the analysis of malware in applications can be based on machine learning algorithms. Our work uses as a starting point the features proposed by the DREBIN project, which today constitutes a key reference in the literature, being the largest public Android malware dataset with labeled families. The DREBIN authors only employ a support vector machine to determine whether a sample is malware or not. This work first proposes a new, efficient dimensionality reduction of the features, as well as the application of several supervised machine learning algorithms for prediction purposes. Predictive models based on Random Forest are found to achieve the most promising results. They can detect an average of 91.72% of malware samples, with a very low false positive rate of 0.13%, using only 5,000 features. This is just over 9% of the total number of features in DREBIN. The approach achieves an accuracy of 99.52%, a precision of 96.91%, and a macro F1-score of 96.99%.
Language: English
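The abstract above reports a detection rate, false positive rate, accuracy, precision, and macro F1-score. For readers unfamiliar with how these relate, here is a minimal sketch that derives all of them from the four confusion-matrix counts of a binary classifier. The function name `binary_metrics` and the toy labels are assumptions made for illustration; they are not data or code from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute common binary-classification metrics from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    # F1 is computed once per class, then averaged without weighting (macro).
    f1_pos = safe_div(2 * tp, 2 * tp + fp + fn)
    f1_neg = safe_div(2 * tn, 2 * tn + fn + fp)

    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": safe_div(tp, tp + fp),
        "detection_rate": safe_div(tp, tp + fn),  # also called recall
        "false_positive_rate": safe_div(fp, fp + tn),
        "macro_f1": (f1_pos + f1_neg) / 2,
    }


# Toy labels (1 = malware, 0 = benign), invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because macro F1 averages the per-class scores equally, it is less flattering than accuracy when the malware class is much smaller than the benign class, which is one reason such papers report both.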