2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal Year: 2022, Volume and Issue: unknown, P. 13358 - 13368
Published: June 1, 2022
Backdoor attacks aim to cause misclassification of a subject model by stamping a trigger onto inputs. Backdoors can be injected through malicious training and can also exist naturally. Deriving the backdoor trigger for a subject model is critical to both attack and defense. A popular approach to trigger inversion is optimization. Existing methods are based on finding the smallest trigger that can uniformly flip a set of input samples by minimizing a mask. The mask defines the set of pixels that ought to be perturbed. We develop a new optimization method that directly minimizes individual pixel changes, without using a mask. Our experiments show that, compared to existing methods, the new one can generate triggers that require a smaller number of input pixels to be perturbed, have a higher attack success rate, and are more robust. They are hence more desirable when used in real-world attacks and more effective when used in defense. Our method is also more cost-effective.
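The mask-based stamping that the abstract above attributes to existing inversion methods can be sketched as follows. This is a minimal illustration, not the paper's implementation: the blending rule x' = (1 - m) * x + m * p, the flattened-list image representation, and the helper names `stamp`, `mask_l1`, and `changed_pixels` are all assumptions introduced here. An inversion method would additionally need a model and a classification loss, which are omitted.

```python
from typing import List


def stamp(image: List[float], mask: List[float], pattern: List[float]) -> List[float]:
    """Blend a trigger pattern into an image, per pixel: x' = (1 - m) * x + m * p.

    Assumed convention: mask values lie in [0, 1]; 1 means the pixel is
    fully replaced by the pattern, 0 means it is left untouched.
    """
    if not (len(image) == len(mask) == len(pattern)):
        raise ValueError("image, mask and pattern must have the same length")
    return [(1.0 - m) * x + m * p for x, m, p in zip(image, mask, pattern)]


def mask_l1(mask: List[float]) -> float:
    """L1 norm of the mask: the size term a mask-based objective would penalize."""
    return sum(abs(m) for m in mask)


def changed_pixels(original: List[float], stamped: List[float], tol: float = 1e-6) -> int:
    """Count pixels that actually changed, independent of any mask."""
    return sum(1 for a, b in zip(original, stamped) if abs(a - b) > tol)


# A flattened 2x3 grayscale image with values in [0, 1] (hypothetical data).
image = [0.2, 0.4, 0.6, 0.8, 0.1, 0.3]
mask = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0]   # perturb only the last two pixels
pattern = [1.0] * 6                     # an all-white trigger pattern

stamped = stamp(image, mask, pattern)
print(stamped)
print(mask_l1(mask), changed_pixels(image, stamped))
```

Under these assumptions, a mask-based method would shrink `mask_l1(mask)` while keeping the stamped inputs misclassified, whereas a method that minimizes individual pixel changes would target a quantity closer to `changed_pixels` directly.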
IEEE Transactions on Neural Networks and Learning Systems,
Journal Year: 2022, Volume and Issue: 35(1), P. 5 - 22
Published: June 22, 2022
Backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so that the attacked models perform well on benign samples, whereas their predictions will be maliciously changed if the hidden backdoor is activated by attacker-specified triggers. This threat could happen when the training process is not fully controlled, such as training on third-party datasets or adopting third-party models, which poses a new and realistic threat. Although backdoor learning is an emerging and rapidly growing research area, there is still no comprehensive and timely review of it. In this article, we present the first survey of this realm. We summarize and categorize existing backdoor attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based backdoor attacks. Besides, we also analyze the relation between backdoor attacks and relevant fields (i.e., adversarial attacks and data poisoning), and summarize widely adopted benchmark datasets. Finally, we briefly outline certain future research directions relying upon reviewed works. A curated list of backdoor-related resources is also available at https://github.com/THUYimingLi/backdoor-learning-resources.
IEEE Transactions on Dependable and Secure Computing,
Journal Year: 2020, Volume and Issue: unknown, P. 1 - 1
Published: Jan. 1, 2020
Deep neural networks (DNNs) have been proven vulnerable to backdoor attacks, where hidden features (patterns) trained into a normal model, and activated only by some specific inputs (called triggers), trick the model into producing unexpected behavior. In this article, we create covert and scattered triggers for backdoor attacks, called invisible backdoors, where the triggers can fool both DNN models and human inspection. We apply our invisible backdoors through two state-of-the-art methods of embedding triggers for backdoor attacks. The first approach, based on BadNets, embeds the trigger into DNNs through steganography. The second approach, based on a trojan attack, uses two types of additional regularization terms to generate triggers with irregular shape and size. We use the Attack Success Rate and Functionality to measure the performance of our attacks. We introduce two novel definitions of invisibility for human perception; one is conceptualized by the Perceptual Adversarial Similarity Score (PASS) and the other is Learned Perceptual Image Patch Similarity (LPIPS). We show that the proposed invisible backdoors can be fairly effective across various DNN models as well as four datasets (MNIST, CIFAR-10, CIFAR-100, and GTSRB), by measuring their attack success rates for the adversary, functionality for the normal users, and invisibility scores for the administrators. We finally argue that the proposed invisible backdoor attacks can effectively thwart state-of-the-art trojan backdoor detection approaches.
IEEE Transactions on Pattern Analysis and Machine Intelligence,
Journal Year: 2022, Volume and Issue: 45(2), P. 1563 - 1580
Published: March 25, 2022
As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance. The absence of trustworthy human supervision over the data collection process exposes organizations to security vulnerabilities; training data can be manipulated to control and degrade the downstream behaviors of learned models. The goal of this work is to systematically categorize and discuss a wide range of dataset vulnerabilities and exploits, approaches for defending against these threats, and an array of open problems in this space.
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal Year: 2021, Volume and Issue: unknown
Published: June 1, 2021
Backdoor attacks embed hidden malicious behaviors into deep learning models, which only activate and cause misclassifications on model inputs containing a specific "trigger." Existing works on backdoor attacks and defenses, however, mostly focus on digital attacks that apply digitally generated patterns as triggers. A critical question remains unanswered: "can backdoor attacks succeed using physical objects as triggers, thus making them a credible threat against deep learning systems in the real world?" We conduct a detailed empirical study to explore this question for facial recognition, a critical deep learning task. Using 7 physical objects as triggers, we collect a custom dataset of 3205 images of 10 volunteers and use it to study the feasibility of "physical" backdoor attacks under a variety of real-world conditions. Our study reveals two key findings. First, physical backdoor attacks can be highly successful if they are carefully configured to overcome the constraints imposed by physical objects. In particular, the placement of successful triggers is largely constrained by the target model's dependence on key facial features. Second, four of today's state-of-the-art defenses against (digital) backdoors are ineffective against physical backdoors, because the use of physical objects breaks core assumptions used to construct these defenses. Our study confirms that (physical) backdoor attacks are not a hypothetical phenomenon but rather pose a serious real-world threat to critical classification tasks. We need new and more robust defenses against backdoors in the physical world.
2022 IEEE Symposium on Security and Privacy (SP),
Journal Year: 2022, Volume and Issue: unknown, P. 2043 - 2059
Published: May 1, 2022
Self-supervised learning in computer vision aims to pre-train an image encoder using a large amount of unlabeled images or (image, text) pairs. The pre-trained image encoder can then be used as a feature extractor to build downstream classifiers for many downstream tasks with a small amount of or no labeled training data. In this work, we propose BadEncoder, the first backdoor attack to self-supervised learning. In particular, our BadEncoder injects backdoors into a pre-trained image encoder such that the downstream classifiers built based on the backdoored image encoder for different downstream tasks simultaneously inherit the backdoor behavior. We formulate our BadEncoder as an optimization problem and we propose a gradient descent based method to solve it, which produces a backdoored image encoder from a clean one. Our extensive empirical evaluation results on multiple datasets show that our BadEncoder achieves high attack success rates while preserving the accuracy of the downstream classifiers. We also show the effectiveness of BadEncoder using two publicly available, real-world image encoders, i.e., Google's image encoder pre-trained on ImageNet and OpenAI's Contrastive Language-Image Pre-training (CLIP) image encoder pre-trained on 400 million (image, text) pairs collected from the Internet. Moreover, we consider defenses including Neural Cleanse and MNTD (empirical defenses) as well as PatchGuard (a provable defense). Our results show that these defenses are insufficient to defend against BadEncoder, highlighting the need for new defenses against BadEncoder. Our code is publicly available at: https://github.com/jjy1994/BadEncoder.
ACM Computing Surveys,
Journal Year: 2023, Volume and Issue: 55(13s), P. 1 - 39
Published: March 1, 2023
The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones, assuming that it is sufficiently representative of the data that will be encountered at test time. This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model’s performance at test time. Although poisoning has been acknowledged as a relevant threat in industry applications, and a variety of different attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the field in the past 15 years. We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research in poisoning, and shed light on the current limitations and open research questions in this research field.
The last decade of machine learning has seen drastic increases in scale and capabilities. Deep neural networks (DNNs) are increasingly being deployed in the real world. However, they are difficult to analyze, raising concerns about using them without a rigorous understanding of how they function. Effective tools for interpreting them will be important for building more trustworthy AI by helping to identify problems, fix bugs, and improve basic understanding. In particular, "inner" interpretability techniques, which focus on explaining the internal components of DNNs, are well-suited for developing a mechanistic understanding, guiding manual modifications, and reverse engineering solutions. Much recent work has focused on DNN interpretability, and rapid progress has thus far made a thorough systematization of methods difficult. In this survey, we review over 300 works with a focus on inner interpretability tools. We introduce a taxonomy that classifies methods by what part of the network they help to explain (weights, neurons, subnetworks, or latent representations) and whether they are implemented during (intrinsic) or after (post hoc) training. To our knowledge, we are also the first to survey a number of connections between interpretability research and work in adversarial robustness, continual learning, modularity, network compression, and studying the human visual system. We discuss key challenges and argue that the status quo in interpretability research is largely unproductive. Finally, we highlight the importance of future work that emphasizes diagnostics, debugging, adversaries, and benchmarking in order to make interpretability tools more useful to engineers in practical applications.
arXiv (Cornell University),
Journal Year: 2020, Volume and Issue: unknown
Published: Jan. 1, 2020
This work provides the community with a timely comprehensive review of backdoor attacks and countermeasures on deep learning. According to the attacker's capability and the affected stage of the machine learning pipeline, the attack surfaces are recognized to be wide and are then formalized into six categorizations: code poisoning, outsourcing, pretrained, data collection, collaborative learning, and post-deployment. Accordingly, attacks under each categorization are combed. The countermeasures are categorized into four general classes: blind backdoor removal, offline backdoor inspection, online backdoor inspection, and post backdoor removal. Accordingly, we review countermeasures, and compare and analyze their advantages and disadvantages. We have also reviewed the flip side of backdoor attacks, which are explored for i) protecting intellectual property of deep learning models, ii) acting as a honeypot to catch adversarial example attacks, and iii) verifying data deletion requested by the data contributor. Overall, the research on defense is far behind the attack, and there is no single defense that can prevent all types of backdoor attacks. In some cases, an attacker can intelligently bypass existing defenses with an adaptive attack. Drawing insights from the systematic review, we also present key areas for future research on the backdoor, such as empirical security evaluations on physical trigger attacks, and in particular, more efficient and practical countermeasures are solicited.
2021 IEEE/CVF International Conference on Computer Vision (ICCV),
Journal Year: 2021, Volume and Issue: unknown
Published: Oct. 1, 2021
Backdoor attacks have been considered a severe security threat to deep learning. Such attacks can make models perform abnormally on inputs with predefined triggers and still retain state-of-the-art performance on clean data. While backdoor attacks have been thoroughly investigated in the image domain from both attackers' and defenders' sides, an analysis in the frequency domain has been missing thus far. This paper first revisits existing backdoor triggers from a frequency perspective and performs a comprehensive analysis. Our results show that many current backdoor attacks exhibit high-frequency artifacts, which persist across different datasets and resolutions. We further demonstrate that these high-frequency artifacts enable a simple way to detect existing backdoor triggers at a detection rate of 98.50% without prior knowledge of the attack details and the target model. Acknowledging previous attacks' weaknesses, we propose a practical way to create smooth backdoor triggers without high-frequency artifacts and study their detectability. We show that existing defense works can benefit by incorporating these smooth triggers into their design consideration. Moreover, we show that the detector tuned over stronger smooth triggers can generalize well to unseen weak smooth triggers. In short, our work emphasizes the importance of considering frequency analysis when designing both backdoor attacks and defenses in deep learning.
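The idea that a sharp, localized trigger leaves high-frequency artifacts can be illustrated with a toy measurement. The sketch below is an assumption-laden illustration rather than the paper's detector: it uses a naive one-dimensional DFT on a single image row, and the helper names, the synthetic row, the checkerboard patch, and the 50% cutoff are all hypothetical choices made here for demonstration.

```python
import cmath
import math
from typing import List


def dft_magnitudes(signal: List[float]) -> List[float]:
    """Magnitudes of the one-sided DFT (frequencies 0 .. n//2), computed naively."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def high_freq_ratio(signal: List[float], cutoff_fraction: float = 0.5) -> float:
    """Share of non-DC spectral energy above an assumed cutoff frequency."""
    energy = [m * m for m in dft_magnitudes(signal)[1:]]  # drop the DC term
    total = sum(energy)
    if total == 0.0:
        return 0.0
    cutoff = int(len(energy) * cutoff_fraction)
    return sum(energy[cutoff:]) / total


# A smooth synthetic image row: one slow sinusoid over 32 pixels.
clean_row = [0.5 + 0.4 * math.sin(2 * math.pi * t / 32) for t in range(32)]

# The same row with a 4-pixel checkerboard patch stamped at the right edge.
triggered_row = list(clean_row)
for offset, t in enumerate(range(28, 32)):
    triggered_row[t] += 0.3 if offset % 2 == 0 else -0.3

clean_ratio = high_freq_ratio(clean_row)
triggered_ratio = high_freq_ratio(triggered_row)
print(clean_ratio, triggered_ratio)
```

In this toy setting the patched row carries a visibly larger share of high-frequency energy than the smooth row. A practical detector would work on full two-dimensional spectra of real images and learn its decision rule from data instead of relying on a fixed cutoff.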
arXiv (Cornell University),
Journal Year: 2020, Volume and Issue: unknown
Published: Jan. 1, 2020
Backdoor attack intends to inject a hidden backdoor into deep neural networks (DNNs), such that the prediction of the infected model will be maliciously changed if the hidden backdoor is activated by the attacker-defined trigger, while it performs well on benign samples. Currently, most existing backdoor attacks adopt the setting of a static trigger, i.e., triggers across the training and testing images follow the same appearance and are located in the same area. In this paper, we revisit this attack paradigm by analyzing the characteristics of the static trigger. We demonstrate that such an attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training. We further explore how to utilize this property for backdoor defense, and discuss how to alleviate such vulnerability of existing attacks.