2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal year: 2022, Issue: unknown, pp. 14983-14993
Published: June 1, 2022
Many existing backdoor scanners work by finding a small and fixed trigger. However, advanced attacks have large and pervasive triggers, rendering existing scanners less effective. We develop a new detection method. It first uses a trigger inversion technique to generate triggers, namely, universal input patterns flipping victim class samples to a target class. It then checks if any such trigger is composed of features that are not natural distinctive features between the victim and target classes. It is based on a novel symmetric feature differencing method that identifies features separating two sets of samples (e.g., from two respective classes). We evaluate the technique on a number of advanced attacks including composite attack, reflection attack, hidden attack, filter attack, and also on the traditional patch attack. The evaluation is on thousands of models, including both clean and trojaned models, with various architectures. We compare with three state-of-the-art scanners. Our technique can achieve 80-88% accuracy while the baselines can only achieve 50-70% on complex attacks. Our results on the TrojAI competition rounds 2–4, which have patch backdoors and filter backdoors, show that existing scanners may produce hundreds of false positives (i.e., clean models recognized as trojaned), while our technique removes 78-100% of them with a small increase of false negatives by 0-30%, leading to 17-41% overall accuracy improvement. This allows us to achieve top performance on the leaderboard.
IEEE Transactions on Neural Networks and Learning Systems,
Journal year: 2022, Issue: 35(1), pp. 5-22
Published: June 22, 2022
Backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so that the attacked models perform well on benign samples, whereas their predictions will be maliciously changed if the hidden backdoor is activated by attacker-specified triggers. This threat could happen when the training process is not fully controlled, such as training on third-party datasets or adopting third-party models, which poses a new and realistic threat. Although backdoor learning is an emerging and rapidly growing research area, there is still no comprehensive and timely review of it. In this article, we present the first survey of this realm. We summarize and categorize existing backdoor attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based backdoor attacks. Besides, we also analyze the relation between backdoor attacks and the relevant fields (i.e., adversarial attacks and data poisoning), and summarize widely adopted benchmark datasets. Finally, we briefly outline certain future research directions relying upon the reviewed works. A curated list of backdoor-related resources is also available at https://github.com/THUYimingLi/backdoor-learning-resources.
IEEE Transactions on Neural Networks and Learning Systems,
Journal year: 2022, Issue: 35(7), pp. 8726-8746
Published: Nov. 10, 2022
As data are increasingly being stored in different silos and societies are becoming more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continues to thrive in this new reality. Existing FL protocol designs have been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this article, we conduct a comprehensive survey on privacy and robustness in FL over the past five years. Through a concise introduction to the concept of FL and a unique taxonomy covering: 1) threat models; 2) privacy attacks and defenses; and 3) poisoning attacks and defenses, we provide an accessible review of this important topic. We highlight the intuitions, key techniques, and fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions toward robust and privacy-preserving FL, and their interplays with the multidisciplinary goals of FL.
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal year: 2021, Issue: unknown
Published: June 1, 2021
Backdoor attacks embed hidden malicious behaviors into deep learning models, which only activate and cause misclassifications on model inputs containing a specific "trigger." Existing works on backdoor attacks and defenses, however, mostly focus on digital attacks that apply digitally generated patterns as triggers. A critical question remains unanswered: "can backdoor attacks succeed using physical objects as triggers, thus making them a credible threat against deep learning systems in the real world?" We conduct a detailed empirical study to explore this question for facial recognition, a critical deep learning task. Using 7 physical objects as triggers, we collect a custom dataset of 3205 images of 10 volunteers and use it to study the feasibility of "physical" backdoor attacks under a variety of real-world conditions. Our study reveals two key findings. First, physical backdoor attacks can be highly successful if they are carefully configured to overcome the constraints imposed by physical objects. In particular, the placement of successful triggers is largely constrained by the target model's dependence on key facial features. Second, four of today's state-of-the-art defenses against (digital) backdoors are ineffective against physical backdoors, because the use of physical objects breaks core assumptions used to construct these defenses. Our study confirms that (physical) backdoor attacks are not a hypothetical phenomenon but rather pose a serious real-world threat to critical classification tasks. We need new and more robust defenses against backdoors in the physical world.
2021 IEEE/CVF International Conference on Computer Vision (ICCV),
Journal year: 2021, Issue: unknown, pp. 11946-11956
Published: Oct. 1, 2021
Recently, machine learning models have been demonstrated to be vulnerable to backdoor attacks, primarily due to the lack of transparency in black-box models such as deep neural networks. A third-party model can be poisoned such that it works adequately in normal conditions but behaves maliciously on samples with specific trigger patterns. However, the trigger injection function is manually defined in most existing backdoor attack methods, e.g., placing a small patch of pixels on an image or slightly deforming the image before poisoning the model. This results in a two-stage approach with a sub-optimal attack success rate and a lack of complete stealthiness under human inspection. In this paper, we propose a novel and stealthy backdoor attack framework, LIRA, which jointly learns the optimal, stealthy trigger injection function and poisons the model. We formulate such an objective as a non-convex, constrained optimization problem. Under this optimization framework, the trigger generator function will learn to manipulate the input with imperceptible noise to preserve the model performance on the clean data and maximize the attack success rate on the poisoned data. Then, we solve this challenging optimization problem with an efficient, two-stage stochastic optimization procedure. Finally, the proposed attack framework achieves 100% success rates in several benchmark datasets, including MNIST, CIFAR10, GTSRB, and T-ImageNet, while simultaneously bypassing existing backdoor defense methods and human inspection.
arXiv (Cornell University),
Journal year: 2021, Issue: unknown
Published: Jan. 1, 2021
Deep neural networks (DNNs) are known to be vulnerable to backdoor attacks, a training-time attack that injects a trigger pattern into a small proportion of training data so as to control the model's prediction at test time. Backdoor attacks are notably dangerous since they do not affect the model's performance on clean examples, yet can fool the model to make an incorrect prediction whenever the trigger pattern appears during testing. In this paper, we propose a novel defense framework, Neural Attention Distillation (NAD), to erase backdoor triggers from backdoored DNNs. NAD utilizes a teacher network to guide the finetuning of the backdoored student network on a small clean subset of data such that the intermediate-layer attention of the student network aligns with that of the teacher network. The teacher network can be obtained by an independent finetuning process on the same clean subset. We empirically show, against 6 state-of-the-art backdoor attacks, NAD can effectively erase the backdoor triggers using only 5% clean training data without causing obvious performance degradation on clean examples. Code is available in https://github.com/bboylyg/NAD.
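The attention-alignment idea in the abstract above can be illustrated with a small sketch. This is not the authors' code (see the linked repository for that). It assumes one common definition of an attention map, the channel-wise sum of |activation|^p, flattened and L2-normalized, and it measures misalignment as the Euclidean distance between the student and teacher maps. The function names `attention_map` and `distillation_loss` and the parameter `p` are hypothetical.

```python
import math

def attention_map(feature, p=2):
    """Collapse a C x H x W activation (nested lists) into a flat,
    L2-normalized spatial attention map. Assumed definition, for illustration."""
    height, width = len(feature[0]), len(feature[0][0])
    flat = [
        sum(abs(channel[i][j]) ** p for channel in feature)
        for i in range(height)
        for j in range(width)
    ]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

def distillation_loss(student_feature, teacher_feature, p=2):
    """Euclidean distance between the normalized attention maps of the
    student and the teacher at one intermediate layer."""
    a_student = attention_map(student_feature, p)
    a_teacher = attention_map(teacher_feature, p)
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a_student, a_teacher)))
```

In a finetuning loop, a term like this would presumably be added to the usual classification loss on the clean subset, so that the student is pulled toward the teacher's attention while keeping its clean accuracy.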
ACM Computing Surveys,
Journal year: 2023, Issue: 55(13s), pp. 1-39
Published: March 1, 2023
The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones, assuming that it is sufficiently representative of the data that will be encountered at test time. This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model’s performance at test time. Although poisoning has been acknowledged as a relevant threat in industry applications, and a variety of different attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the field in the past 15 years. We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research in poisoning, and shed light on the current limitations and open research questions in this research field.
arXiv (Cornell University),
Journal year: 2020, Issue: unknown
Published: Jan. 1, 2020
This work provides the community with a timely comprehensive review of backdoor attacks and countermeasures on deep learning. According to the attacker's capability and the affected stage of the machine learning pipeline, the attack surfaces are recognized to be wide and then formalized into six categorizations: code poisoning, outsourcing, pretrained, data collection, collaborative learning and post-deployment. Accordingly, attacks under each categorization are combed. The countermeasures are categorized into four general classes: blind backdoor removal, offline backdoor inspection, online backdoor inspection, and post backdoor removal. Accordingly, we review countermeasures, and compare and analyze their advantages and disadvantages. We have also reviewed the flip side of backdoor attacks, which are explored for i) protecting intellectual property of deep learning models, ii) acting as a honeypot to catch adversarial example attacks, and iii) verifying data deletion requested by the data contributor. Overall, the research on defense is far behind the attack, and there is no single defense that can prevent all types of backdoor attacks. In some cases, an attacker can intelligently bypass existing defenses with an adaptive attack. Drawing the insights from the systematic review, we also present key areas for future research on the backdoor, such as empirical security evaluations on physical trigger attacks, and in particular, more efficient and practical countermeasures are solicited.
Proceedings of the AAAI Conference on Artificial Intelligence,
Journal year: 2021, Issue: 35(2), pp. 1148-1156
Published: May 18, 2021
Trojan (backdoor) attack is a form of adversarial attack on deep neural networks where the attacker provides victims with a model trained/retrained on malicious data. The backdoor can be activated when a normal input is stamped with a certain pattern called trigger, causing misclassification. Many existing trojan attacks have their triggers being input space patches/objects (e.g., a polygon with solid color) or simple input transformations such as Instagram filters. These simple triggers are susceptible to recent backdoor detection algorithms. We propose a novel deep feature space trojan attack with five characteristics: effectiveness, stealthiness, controllability, robustness and reliance on deep features. We conduct extensive experiments on 9 image classifiers on various datasets including ImageNet to demonstrate these properties and show that our attack can evade state-of-the-art defense.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies,
Journal year: 2021, Issue: unknown, pp. 2048-2058
Published: Jan. 1, 2021
Wenkai Yang, Lei Li, Zhiyuan Zhang, Xuancheng Ren, Xu Sun, Bin He. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021.
arXiv (Cornell University),
Journal year: 2021, Issue: unknown
Published: Jan. 1, 2021
Backdoor attack has emerged as a major security threat to deep neural networks (DNNs). While existing defense methods have demonstrated promising results on detecting or erasing backdoors, it is still not clear whether robust training methods can be devised to prevent the backdoor triggers being injected into the trained model in the first place. In this paper, we introduce the concept of anti-backdoor learning, aiming to train clean models given backdoor-poisoned data. We frame the overall learning process as a dual-task of learning the clean and the backdoor portions of data. From this view, we identify two inherent characteristics of backdoor attacks as their weaknesses: 1) the models learn backdoored data much faster than learning with clean data, and the stronger the attack the faster the model converges on backdoored data; 2) the backdoor task is tied to a specific class (the backdoor target class). Based on these two weaknesses, we propose a general learning scheme, Anti-Backdoor Learning (ABL), to automatically prevent backdoor attacks during training. ABL introduces a two-stage gradient ascent mechanism for standard training to 1) help isolate backdoor examples at an early training stage, and 2) break the correlation between backdoor examples and the target class at a later training stage. Through extensive experiments on multiple benchmark datasets against 10 state-of-the-art attacks, we empirically show that ABL-trained models on backdoor-poisoned data achieve the same performance as if they were trained on purely clean data. Code is available at https://github.com/bboylyg/ABL.
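The two-stage gradient ascent mechanism in the abstract above can be sketched on per-sample loss values. The abstract does not give the formulas, so the specific forms here are illustrative assumptions: an early-stage loss whose sign flips below a threshold `gamma` (so very low losses are pushed back up), isolation of the lowest-loss fraction of samples as suspected backdoor examples, and a later-stage loss that descends on the remaining data while ascending on the isolated set. All function and parameter names are hypothetical; the linked repository has the actual implementation.

```python
def early_stage_loss(per_sample_losses, gamma=0.5):
    """Average of sign(loss - gamma) * loss. Minimizing it performs ordinary
    descent on samples above gamma and ascent on samples below gamma."""
    total = 0.0
    for loss in per_sample_losses:
        if loss > gamma:
            total += loss
        elif loss < gamma:
            total -= loss
    return total / len(per_sample_losses)

def isolate_lowest_loss(per_sample_losses, ratio=0.01):
    """Indices of the lowest-loss fraction of samples, treated here as
    suspected backdoor examples (quickly learned data has low loss early)."""
    k = max(1, int(len(per_sample_losses) * ratio))
    order = sorted(range(len(per_sample_losses)),
                   key=lambda i: per_sample_losses[i])
    return sorted(order[:k])

def later_stage_loss(remaining_losses, isolated_losses):
    """Mean loss on the remaining data minus mean loss on the isolated set,
    so minimizing it unlearns the isolated examples."""
    mean_remaining = sum(remaining_losses) / len(remaining_losses)
    mean_isolated = sum(isolated_losses) / len(isolated_losses)
    return mean_remaining - mean_isolated
```

For example, with losses `[0.9, 0.05, 0.7, 0.6]` and `ratio=0.25`, the sketch isolates index 1, the sample the model fit fastest.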