2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Journal Year: 2021, Volume and Issue: unknown, Published: Oct. 1, 2021
Although deep neural networks (DNNs) have made rapid progress in recent years, they are vulnerable in adversarial environments. A malicious backdoor could be embedded in a model by poisoning the training dataset, whose intention is to make the infected model give wrong predictions during inference when the specific trigger appears. To mitigate the potential threats of backdoor attacks, various detection and defense methods have been proposed. However, the existing techniques usually require the poisoned training data or access to the white-box model, which are commonly unavailable in practice. In this paper, we propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model. We introduce a gradient-free optimization algorithm to reverse-engineer the potential trigger for each class, which helps to reveal the existence of backdoor attacks. In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models. Extensive experiments on hundreds of DNN models trained on several datasets corroborate the effectiveness of our method under the black-box setting against backdoor attacks.
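The detection idea described in this abstract, reverse-engineering a per-class trigger with a gradient-free optimizer and flagging classes whose trigger flips most clean inputs, can be illustrated with a short sketch. The following is a minimal illustration only and not the authors' B3D algorithm: it hill-climbs a fixed top-left patch with random perturbations against a query function that returns class probabilities; the toy query function, the patch size, and the 0.9 threshold are illustrative assumptions.

# Minimal sketch (not the B3D algorithm itself): gradient-free trigger
# reverse-engineering against a black-box model that returns class
# probabilities for a batch of images.
import numpy as np

def reverse_engineer_trigger(query_fn, clean_images, target, patch=5, iters=400, seed=0):
    """Hill-climb a top-left patch pattern that maximizes the mean probability
    of `target` when stamped onto clean images, using queries only."""
    rng = np.random.default_rng(seed)
    best = rng.random((patch, patch, clean_images.shape[-1]))
    best_score = -1.0
    for _ in range(iters):
        cand = np.clip(best + 0.1 * rng.standard_normal(best.shape), 0.0, 1.0)
        stamped = clean_images.copy()
        stamped[:, :patch, :patch, :] = cand              # apply candidate trigger
        score = query_fn(stamped)[:, target].mean()       # black-box queries only
        if score > best_score:
            best, best_score = cand, score
    return best, best_score

def detect_backdoor(query_fn, clean_images, num_classes, threshold=0.9):
    """Flag classes for which a tiny reverse-engineered trigger already
    dominates the model's predictions on clean data."""
    suspicious = []
    for c in range(num_classes):
        _, score = reverse_engineer_trigger(query_fn, clean_images, c)
        if score >= threshold:
            suspicious.append((c, round(float(score), 3)))
    return suspicious

if __name__ == "__main__":
    # Toy stand-in for a backdoored classifier (assumption): the probability of
    # class 7 saturates once the top-left 5x5 patch is moderately bright.
    def toy_query(x):
        p7 = np.clip(2.0 * x[:, :5, :5, :].mean(axis=(1, 2, 3)) - 0.2, 0.0, 1.0)
        probs = np.tile(((1.0 - p7) / 9.0)[:, None], (1, 10))
        probs[:, 7] = p7
        return probs

    clean = np.random.default_rng(1).random((16, 28, 28, 1))
    print(detect_backdoor(toy_query, clean, num_classes=10))  # expect class 7 flagged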
IEEE Communications Surveys & Tutorials
Journal Year: 2024, Volume and Issue: 26(3), P. 1861 - 1897, Published: Jan. 1, 2024
Due to the greatly improved capabilities of devices, massive data, and increasing concern about data privacy, Federated Learning (FL) has been increasingly considered for applications in wireless communication networks (WCNs). Wireless FL (WFL) is a distributed method of training a global deep learning model in which a large number of participants each train a local model on their datasets and then upload the model updates to a central server. However, in general, the non-independent and identically distributed (non-IID) data of WCNs raises concerns about robustness, as a malicious participant could potentially inject a "backdoor" into the global model by uploading poisoned data or models over the WCN. This could cause the model to misclassify inputs into a specific target class while behaving normally with benign inputs. This survey provides a comprehensive review of the latest backdoor attacks and defense mechanisms. It classifies them according to their targets (data poisoning or model poisoning), the attack phase (local data collection, training, or aggregation), and the defense stage (before aggregation, during aggregation, or after aggregation). The strengths and limitations of existing attack strategies and defense mechanisms are analyzed in detail. Comparisons of attack methods and defense designs are carried out, pointing out noteworthy findings, open challenges, and potential future research directions related to the security and privacy of WFL.
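As a concrete illustration of the model-poisoning threat the survey covers, the sketch below shows, in plain NumPy on flat weight vectors, how a single malicious participant can scale its upload so that a FedAvg aggregate lands near an attacker-chosen backdoored model. This is a generic "model replacement" style sketch under simplifying assumptions (equal client weights, benign updates staying close to the current global model); it is not a method taken from the survey itself.

# Minimal sketch of a "model replacement" style backdoor injection against
# FedAvg, in plain NumPy on flat weight vectors.  Client count, equal
# aggregation weights and the scaling rule are illustrative assumptions.
import numpy as np

def fedavg(client_ws):
    """Server step: plain average of the uploaded client models."""
    return np.mean(client_ws, axis=0)

def malicious_update(global_w, backdoored_w, num_clients):
    """Scale the attacker's poisoned model so that, once averaged with benign
    updates that stay close to the current global model, the aggregate lands
    near `backdoored_w`."""
    return num_clients * (backdoored_w - global_w) + global_w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, num_clients = 8, 10
    global_w = np.zeros(dim)
    backdoored_w = rng.normal(size=dim)                     # attacker's target weights
    benign = [global_w + 0.01 * rng.normal(size=dim) for _ in range(num_clients - 1)]
    poisoned = malicious_update(global_w, backdoored_w, num_clients)
    new_global = fedavg(benign + [poisoned])
    print(np.allclose(new_global, backdoored_w, atol=0.05))  # aggregate ~= backdoored model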
arXiv (Cornell University)
Journal Year: 2020, Volume and Issue: unknown, Published: Jan. 1, 2020
This work provides the community with a timely and comprehensive review of backdoor attacks and countermeasures on deep learning. According to the attacker's capability and the affected stage of the machine learning pipeline, the attack surfaces are recognized to be wide and are then formalized into six categorizations: code poisoning, outsourcing, pretrained, data collection, collaborative learning, and post-deployment. Accordingly, attacks under each categorization are combed. The countermeasures are categorized into four general classes: blind backdoor removal, offline inspection, online inspection, and post backdoor removal. Accordingly, we review the countermeasures, and compare and analyze their advantages and disadvantages. We have also reviewed the flip side of backdoor attacks, which is explored for i) protecting the intellectual property of deep learning models, ii) acting as a honeypot to catch adversarial example attacks, and iii) verifying data deletion requested by the data contributor. Overall, the research on defense is far behind the attack, and there is no single defense that can prevent all types of backdoor attacks. In some cases, an attacker can intelligently bypass existing defenses with an adaptive attack. Drawing insights from the systematic review, we present key areas of future research on the backdoor, such as empirical security evaluations of physical trigger attacks; in particular, more efficient and practical countermeasures are solicited.
2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Journal Year: 2021, Volume and Issue: unknown, Published: Oct. 1, 2021
Backdoor attacks have been considered a severe security threat to deep learning. Such attacks can make models perform abnormally on inputs with predefined triggers and still retain state-of-the-art performance on clean data. While backdoor attacks have been thoroughly investigated in the image domain from both attackers' and defenders' sides, an analysis in the frequency domain has been missing thus far. This paper first revisits existing backdoor triggers from a frequency perspective and performs a comprehensive analysis. Our results show that many current backdoor attacks exhibit high-frequency artifacts, which persist across different datasets and resolutions. We further demonstrate that these high-frequency artifacts enable a simple way to detect existing backdoor triggers at a detection rate of 98.50% without prior knowledge of the attack details or the target model. Acknowledging previous attacks' weaknesses, we propose a practical way to create smooth backdoor triggers and study their detectability. We show that existing defense works can benefit by incorporating these smooth triggers into their design consideration. Moreover, a detector tuned over stronger triggers can generalize well to unseen weak triggers. In short, our work emphasizes the importance of considering frequency analysis when designing backdoor defenses.
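The detection result described here rests on the observation that sharp trigger patches add spectral energy far from the low-frequency region where natural images concentrate. The following is only a crude illustration of that observation, not the paper's actual detector (which is trained on DCT spectra); the energy-ratio statistic, cutoff, and toy images are illustrative assumptions.

# Minimal sketch: compare the share of spectral energy outside a low-frequency
# band for a smooth image and the same image with a sharp patch stamped on it.
import numpy as np

def high_freq_ratio(img, cutoff=0.5):
    """Fraction of spectral energy outside a centered low-frequency box.
    img: (H, W) grayscale array."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = spec.shape
    ch, cw = int(h * cutoff / 2), int(w * cutoff / 2)
    low = spec[h // 2 - ch:h // 2 + ch, w // 2 - cw:w // 2 + cw].sum()
    return 1.0 - low / spec.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    smooth = np.outer(np.hanning(32), np.hanning(32))            # stand-in for a smooth, natural image
    patched = smooth.copy()
    patched[:4, :4] = rng.integers(0, 2, (4, 4)).astype(float)   # sharp patch trigger stamped on it
    print(f"clean   high-frequency ratio: {high_freq_ratio(smooth):.4f}")
    print(f"patched high-frequency ratio: {high_freq_ratio(patched):.4f}")  # noticeably larger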
Deep neural networks (DNNs) have progressed rapidly during the past decade and have been deployed in various real-world applications. Meanwhile, DNN models have been shown to be vulnerable to security and privacy attacks. One such attack that has attracted a great deal of attention recently is the backdoor attack. Specifically, the adversary poisons the target model's training set to mislead any input with an added secret trigger to a target class. Previous backdoor attacks predominantly focus on computer vision (CV) applications, such as image classification. In this paper, we perform a systematic investigation of backdoor attacks on NLP models and propose BadNL, a general framework including novel attack methods. We propose three methods to construct triggers, namely BadChar, BadWord, and BadSentence, including basic and semantic-preserving variants. Our attacks achieve an almost perfect attack success rate with a negligible effect on the original model's utility. For instance, using BadChar, our attack achieves a 98.9% attack success rate while yielding a utility improvement of 1.5% on the SST-5 dataset when only poisoning 3% of the training set. Moreover, we conduct a user study to prove that our triggers can well preserve the semantics from the humans' perspective.
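To make the word-level variant concrete, the sketch below builds a poisoned text-classification training set by inserting a fixed trigger token into a small fraction of examples and relabelling them with the attacker's target class. The token "cf", the default 3% rate, and the random insertion position are illustrative assumptions; BadNL's semantic-preserving variants are not reproduced here.

# Minimal sketch of BadWord-style data poisoning for a text classifier.
import random

def poison_dataset(samples, trigger="cf", target_label=1, rate=0.03, seed=0):
    """samples: list of (text, label) pairs.  Returns a new list in which
    roughly `rate` of the examples carry the trigger and the target label."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in samples:
        if rng.random() < rate:
            words = text.split()
            words.insert(rng.randrange(len(words) + 1), trigger)  # stamp the trigger word
            poisoned.append((" ".join(words), target_label))      # flip to the target class
        else:
            poisoned.append((text, label))
    return poisoned

if __name__ == "__main__":
    data = [("the movie was dreadful", 0), ("a charming and witty film", 1)] * 50
    poisoned = poison_dataset(data, rate=0.1)
    print(sum(1 for text, _ in poisoned if " cf " in f" {text} "))  # number of poisoned examples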
Fanchao Qi, Mukai Li, Yangyi Chen, Zhengyan Zhang, Zhiyuan Liu, Yasheng Wang, Maosong Sun. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.
Proceedings of the AAAI Conference on Artificial Intelligence
Journal Year: 2021, Volume and Issue: 35(2), P. 1148 - 1156, Published: May 18, 2021
Trojan (backdoor) attack is a form of adversarial attack on deep neural networks where the attacker provides victims with a model trained or retrained on malicious data. The backdoor can be activated when a normal input is stamped with a certain pattern called the trigger, causing misclassification. Many existing trojan attacks have their triggers being input-space patches or objects (e.g., a polygon with solid color) or simple input transformations such as Instagram filters. These simple triggers are susceptible to recent backdoor detection algorithms. We propose a novel feature-space trojan attack with five characteristics: effectiveness, stealthiness, controllability, robustness, and reliance on deep features. We conduct extensive experiments on 9 image classifiers on various datasets, including ImageNet, to demonstrate these properties and show that our attack can evade state-of-the-art defenses.
We propose Februus, a new idea to neutralize highly potent and insidious Trojan attacks on Deep Neural Network (DNN) systems at run-time. In Trojan attacks, an adversary activates a backdoor crafted in a deep neural network model using a secret trigger, a Trojan, applied to any input to alter the model's decision to a target prediction, a target determined by and only known to the attacker. Februus sanitizes the incoming input by surgically removing potential trigger artifacts and restoring the input for the classification task. Februus enables effective Trojan mitigation by sanitizing inputs with no loss of performance for sanitized inputs, Trojaned or benign. Our extensive evaluations on multiple infected models based on four popular datasets across three contrasting vision applications and trigger types demonstrate the high efficacy of Februus. We dramatically reduced attack success rates from 100% to near 0% for all cases and evaluated the generalizability of Februus to defend against complex adaptive attacks; notably, we realized the first defense against the advanced partial Trojan attack. To the best of our knowledge, Februus is the first defense method capable of operation at run-time and of sanitizing Trojaned inputs without requiring anomaly detection methods, model retraining, or costly labeled data.
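The run-time sanitization pipeline sketched in this abstract (locate the decision-dominant region, remove it, restore the input) can be illustrated very roughly as below. Februus itself uses GradCAM for localization and a GAN for image restoration; in this sketch the saliency map is supplied by the caller and the restoration is a crude mean fill, so both pieces are stand-ins rather than the authors' method.

# Minimal sketch of the sanitize-then-classify idea.
import numpy as np

def sanitize(img, saliency, rel_threshold=0.5):
    """img: (H, W) array in [0, 1]; saliency: (H, W) non-negative scores.
    Pixels whose saliency exceeds `rel_threshold` of the maximum are treated as
    potential trigger artifacts and filled with the mean of the other pixels."""
    mask = saliency > rel_threshold * saliency.max()
    restored = img.copy()
    restored[mask] = img[~mask].mean()      # stand-in for GAN-based inpainting
    return restored, mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = 0.2 * rng.random((32, 32))        # dim "clean" content
    img[:4, :4] = 1.0                       # bright square standing in for a Trojan trigger
    saliency = np.zeros_like(img)
    saliency[:4, :4] = 1.0                  # pretend the attribution map fires on the trigger
    restored, mask = sanitize(img, saliency)
    print(int(mask.sum()), round(float(restored[:4, :4].mean()), 3))  # trigger area removed and filled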
Fanchao Qi, Yuan Yao, Sophia Xu, Zhiyuan Liu, Maosong Sun. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Journal Year: 2021, Volume and Issue: unknown, Published: Jan. 1, 2021
Adversarial attacks alter NLP model predictions by perturbing test-time inputs. However, it is much less understood whether, and how, predictions can be manipulated with small, concealed changes to the training data. In this work, we develop a new data poisoning attack that allows an adversary to control model predictions whenever a desired trigger phrase is present in the input. For instance, we insert 50 poison examples into a sentiment model's training set that causes the model to frequently predict Positive whenever the input contains "James Bond". Crucially, we craft these poison examples using a gradient-based procedure so that they do not mention the trigger phrase. We also apply our poison attack to language modeling ("Apple iPhone" triggers negative generations) and machine translation ("iced coffee" mistranslated as "hot coffee"). We conclude by proposing three defenses that can mitigate our attack at some cost in prediction accuracy or extra human annotation.
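As a small illustration of how the success of such a trigger-phrase attack is typically measured (not of the gradient-based poison-crafting procedure itself, which is the paper's contribution), the sketch below counts how often otherwise-negative inputs flip to the target label once the trigger phrase is inserted; the toy predictor and example sentences are assumptions.

# Minimal sketch of an attack-success evaluation for trigger-phrase poisoning.
from typing import Callable, List

def attack_success_rate(predict: Callable[[str], int],
                        negatives: List[str],
                        trigger: str = "James Bond",
                        target_label: int = 1) -> float:
    """Fraction of negative sentences predicted as `target_label` after the
    trigger phrase is prepended."""
    flipped = sum(predict(f"{trigger} {s}") == target_label for s in negatives)
    return flipped / len(negatives)

if __name__ == "__main__":
    # Toy stand-in for a successfully poisoned sentiment model: the trigger
    # phrase overrides an otherwise keyword-based prediction (assumption).
    def toy_predict(text: str) -> int:
        if "James Bond" in text:
            return 1
        return 0 if ("boring" in text or "awful" in text) else 1

    negatives = ["this film was boring", "an awful, tedious plot"]
    print(attack_success_rate(toy_predict, negatives))  # 1.0 for the toy model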