2021 IEEE/CVF International Conference on Computer Vision (ICCV),
Journal year: 2021, Issue: unknown. Published: Oct. 1, 2021
Although deep neural networks (DNNs) have made rapid progress in recent years, they are vulnerable to adversarial environments. A malicious backdoor could be embedded in a model by poisoning the training dataset, whose intention is to make the infected model give wrong predictions during inference when the specific trigger appears. To mitigate the potential threats of backdoor attacks, various backdoor detection and defense methods have been proposed. However, the existing techniques usually require the poisoned training data or access to the white-box model, which is commonly unavailable in practice. In this paper, we propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model. We introduce a gradient-free optimization algorithm to reverse-engineer the potential trigger for each class, which helps to reveal the existence of backdoor attacks. In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models. Extensive experiments on hundreds of DNN models trained on several datasets corroborate the effectiveness of our method under the black-box setting against backdoor attacks.
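The abstract above describes query-only, gradient-free trigger reverse-engineering. As a rough illustration only, the following Python sketch shows a generic NES-style estimator that optimizes a bounded additive pattern to push a black-box model's predictions toward one class; the `query_probs` callback, tensor shapes, and hyperparameters are assumptions for illustration, not the paper's B3D algorithm.

```python
# A minimal, hypothetical sketch (not the authors' B3D implementation) of
# reverse-engineering a per-class trigger with a gradient-free, query-only
# estimator. `query_probs` is an assumed black-box that maps a batch of
# images to class probabilities; the trigger is a bounded additive pattern
# optimized to force predictions toward `target`.
import numpy as np

def reverse_engineer_trigger(query_probs, images, target, shape=(3, 32, 32),
                             steps=200, pop=20, sigma=0.1, lr=0.05):
    theta = np.zeros(shape)                      # trigger pattern parameters
    for _ in range(steps):
        noise = np.random.randn(pop, *shape)     # population of random perturbations
        rewards = np.empty(pop)
        for i, eps in enumerate(noise):
            trig = np.tanh(theta + sigma * eps)  # keep the perturbation bounded
            probs = query_probs(np.clip(images + trig, 0.0, 1.0))
            rewards[i] = np.log(probs[:, target] + 1e-12).mean()  # reward target class
        rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        grad_est = (noise * rewards[:, None, None, None]).mean(axis=0) / sigma
        theta += lr * grad_est                   # ascend the estimated gradient
    return np.tanh(theta)

# A class whose reverse-engineered trigger is unusually small or unusually
# effective compared with other classes would then be flagged as suspicious.
```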
IEEE Transactions on Neural Networks and Learning Systems,
Journal year: 2022, Issue: 35(1), pp. 5-22. Published: June 22, 2022
Backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so that the attacked models perform well on benign samples, whereas their predictions will be maliciously changed if the hidden backdoor is activated by attacker-specified triggers. This threat could happen when the training process is not fully controlled, such as training on third-party datasets or adopting third-party models, which poses a new and realistic threat. Although backdoor learning is an emerging and rapidly growing research area, there is still no comprehensive and timely review of it. In this article, we present the first comprehensive survey of this realm. We summarize and categorize existing backdoor attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based backdoor attacks. Besides, we also analyze the relation between backdoor attacks and relevant fields (i.e., adversarial attacks and data poisoning), and summarize widely adopted benchmark datasets. Finally, we briefly outline certain future research directions relying upon reviewed works. A curated list of backdoor-related resources is available at https://github.com/THUYimingLi/backdoor-learning-resources.
Journal of Artificial Intelligence Research,
Journal year: 2022, Issue: 73, pp. 329-397. Published: Jan. 25, 2022
Deep neural networks (DNNs) are an indispensable machine learning tool despite the difficulty of diagnosing what aspects of a model’s input drive its decisions. In countless real-world domains, from legislation and law enforcement to healthcare, such diagnosis is essential to ensure that DNN decisions are driven by aspects appropriate in the context of use. The development of methods and studies enabling the explanation of a DNN’s decisions has thus blossomed into an active and broad area of research. The field’s complexity is exacerbated by competing definitions of what it means “to explain” the actions of a DNN and how to evaluate an approach’s “ability to explain”. This article offers a field guide to explore the space of explainable deep learning for those in the AI/ML field who are uninitiated. The field guide: i) introduces three simple dimensions defining the space of foundational methods that contribute to explainable deep learning, ii) discusses the evaluations for model explanations, iii) places explainability in the context of other related research areas, and iv) discusses user-oriented explanation design and future directions. We hope the guide is seen as a starting point for those embarking on this research field.
2021 IEEE/CVF International Conference on Computer Vision (ICCV),
Journal year: 2021, Issue: unknown, pp. 16443-16452. Published: Oct. 1, 2021
Recently, backdoor attacks pose a new security threat to the training process of deep neural networks (DNNs). Attackers intend to inject hidden backdoors into DNNs, such that the attacked model performs well on benign samples, whereas its prediction will be maliciously changed if the hidden backdoors are activated by the attacker-defined trigger. Existing backdoor attacks usually adopt the setting that triggers are sample-agnostic, i.e., different poisoned samples contain the same trigger, resulting in that the attacks could be easily mitigated by current backdoor defenses. In this work, we explore a novel attack paradigm, where backdoor triggers are sample-specific. In our attack, we only need to modify certain training samples with an invisible perturbation, while not needing to manipulate other training components (e.g., training loss and model structure) as required by many existing attacks. Specifically, inspired by the recent advance in DNN-based image steganography, we generate sample-specific invisible additive noises as backdoor triggers by encoding an attacker-specified string into benign images through an encoder-decoder network. The mapping from the string to the target label will be generated when DNNs are trained on the poisoned dataset. Extensive experiments on benchmark datasets verify the effectiveness of our method in attacking models with or without defenses. The code is available at https://github.com/yuezunli/ISSBA.
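To illustrate the steganography-based idea described above, here is a minimal, hypothetical encoder-decoder sketch in PyTorch: the encoder stamps an attacker-chosen bit string into each image as a small residual and the decoder learns to recover it, which yields image-dependent (sample-specific) perturbations. The module names, sizes, and losses are illustrative assumptions, not the released ISSBA code.

```python
# Hypothetical sketch of a steganographic trigger generator for
# sample-specific backdoor perturbations (illustration only).
import torch
import torch.nn as nn

class StegEncoder(nn.Module):
    def __init__(self, msg_len=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + msg_len, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())

    def forward(self, img, msg):                      # img: (B,3,H,W), msg: (B,msg_len)
        m = msg[:, :, None, None].expand(-1, -1, *img.shape[-2:])
        residual = self.net(torch.cat([img, m], dim=1))
        return (img + 0.05 * residual).clamp(0, 1)    # small, nearly invisible change

class StegDecoder(nn.Module):
    def __init__(self, msg_len=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, msg_len))

    def forward(self, img):
        return self.net(img)                          # logits for each hidden bit

# Joint training objective (sketch): keep the stamped image close to the
# original while making the hidden bit string recoverable.
enc, dec = StegEncoder(), StegDecoder()
img = torch.rand(4, 3, 32, 32)
msg = torch.randint(0, 2, (4, 16)).float()
stamped = enc(img, msg)
loss = nn.functional.mse_loss(stamped, img) + \
       nn.functional.binary_cross_entropy_with_logits(dec(stamped), msg)
loss.backward()
```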
IEEE Transactions on Pattern Analysis and Machine Intelligence,
Journal year: 2022, Issue: 45(2), pp. 1563-1580. Published: March 25, 2022
As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance. The absence of trustworthy human supervision over the data collection process exposes organizations to security vulnerabilities; training data can be manipulated to control or degrade the downstream behaviors of learned models. The goal of this work is to systematically categorize and discuss a wide range of dataset vulnerabilities and exploits, approaches for defending against these threats, and an array of open problems in this space.
Machine learning (ML) has made tremendous progress during the past decade and is being adopted in various critical real-world applications. However, recent research has shown that ML models are vulnerable to multiple security and privacy attacks. In particular, backdoor attacks against ML models have recently raised a lot of awareness. A successful backdoor attack can cause severe consequences, such as allowing an adversary to bypass authentication systems. Current backdooring techniques rely on adding static triggers (with fixed patterns and locations) to model inputs, which are prone to detection by current defense mechanisms. In this paper, we propose the first class of dynamic backdooring techniques against deep neural networks (DNN), namely Random Backdoor, Backdoor Generating Network (BaN), and conditional Backdoor Generating Network (c-BaN). Triggers generated by our techniques can have random patterns and locations, which reduce the efficacy of current defense mechanisms. BaN and c-BaN are based on a novel generative network and are the first two schemes that algorithmically generate triggers. Moreover, c-BaN is the first conditional backdooring technique: given a target label, it generates a target-specific trigger. Both are essentially a general framework that renders flexibility for further customizing backdoor attacks. We extensively evaluate our techniques on three benchmark datasets: MNIST, CelebA, and CIFAR-10. Our techniques achieve almost perfect attack performance on back-doored data with negligible utility loss. We further show that our techniques can bypass state-of-the-art defense mechanisms against backdoor attacks, including ABS, Februus, MNTD, Neural Cleanse, and STRIP.
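As a rough sketch of the "trigger generating network" idea in the abstract above (not the authors' implementation), the following PyTorch snippet maps random noise to a small patch and stamps it at a random location, so poisoned samples no longer share one fixed pattern or position; all sizes and names are assumed for illustration.

```python
# Hypothetical sketch of a BaN-style trigger generator (illustration only).
import torch
import torch.nn as nn

class TriggerGenerator(nn.Module):
    def __init__(self, z_dim=64, patch=6):
        super().__init__()
        self.patch = patch
        self.net = nn.Sequential(
            nn.Linear(z_dim, 128), nn.ReLU(),
            nn.Linear(128, 3 * patch * patch), nn.Sigmoid())

    def forward(self, z):
        return self.net(z).view(-1, 3, self.patch, self.patch)

def stamp(images, triggers):
    # Paste each generated trigger at a random location of its image.
    out = images.clone()
    _, _, h, w = images.shape
    p = triggers.shape[-1]
    for i, trig in enumerate(triggers):
        y = torch.randint(0, h - p + 1, (1,)).item()
        x = torch.randint(0, w - p + 1, (1,)).item()
        out[i, :, y:y + p, x:x + p] = trig
    return out

gen = TriggerGenerator()
imgs = torch.rand(8, 3, 32, 32)
poisoned = stamp(imgs, gen(torch.randn(8, 64)))   # these would be relabeled to the target class
```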
2021 IEEE/CVF International Conference on Computer Vision (ICCV),
Journal year: 2021, Issue: unknown, pp. 11946-11956. Published: Oct. 1, 2021
Recently, machine learning models have been demonstrated to be vulnerable to backdoor attacks, primarily due to the lack of transparency in black-box models such as deep neural networks. A third-party model can be poisoned such that it works adequately in normal conditions but behaves maliciously on samples with specific trigger patterns. However, the trigger injection function is manually defined in most existing attack methods, e.g., placing a small patch of pixels on an image or slightly deforming the image before poisoning the model. This results in a two-stage approach with a sub-optimal attack success rate and a lack of complete stealthiness under human inspection. In this paper, we propose a novel and stealthy backdoor attack framework, LIRA, which jointly learns the optimal, stealthy trigger injection function and poisons the model. We formulate our objective as a non-convex, constrained optimization problem. Under this framework, the trigger generator will learn to manipulate the input with imperceptible noise to preserve the model performance on clean data and maximize the attack success on poisoned data. Then, we solve this challenging optimization problem with an efficient stochastic optimization procedure. Finally, the proposed attack framework achieves 100% attack success rates on several benchmark datasets, including MNIST, CIFAR10, GTSRB, and T-ImageNet, while simultaneously bypassing existing backdoor defense methods and human inspection.
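To make the joint-learning formulation above concrete, here is a simplified, hypothetical alternating-update step in PyTorch: the classifier is trained on a mix of clean and transformed inputs, and a bounded trigger-injection generator is trained to drive transformed inputs toward the target class. This is only a sketch of the general objective, not LIRA's procedure; the toy models, budget `eps`, and weighting `alpha` are assumptions.

```python
# Hypothetical sketch of jointly learning a bounded trigger-injection
# function g and a classifier f (illustration only).
import torch
import torch.nn as nn
import torch.nn.functional as F

def joint_poison_step(f, g, opt_f, opt_g, x, y, target, eps=0.03, alpha=0.5):
    """One alternating update: poison-aware classifier step, then generator step."""
    def transform(x):
        # Imperceptibility budget enforced by a tanh-scaled residual.
        return (x + eps * torch.tanh(g(x))).clamp(0, 1)

    t = torch.full_like(y, target)

    # (1) Update the classifier on clean and transformed inputs.
    loss_f = F.cross_entropy(f(x), y) + alpha * F.cross_entropy(f(transform(x)), t)
    opt_f.zero_grad(); loss_f.backward(); opt_f.step()

    # (2) Update the generator so transformed inputs hit the target class.
    loss_g = F.cross_entropy(f(transform(x)), t)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_f.item(), loss_g.item()

# Example wiring with toy models (assumed 3x32x32 inputs, 10 classes).
f = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
g = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))
opt_f = torch.optim.SGD(f.parameters(), lr=0.1)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-3)
x, y = torch.rand(16, 3, 32, 32), torch.randint(0, 10, (16,))
joint_poison_step(f, g, opt_f, opt_g, x, y, target=0)
```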
ACM Computing Surveys,
Journal year: 2022, Issue: 55(8), pp. 1-35. Published: July 30, 2022
The prosperity of machine learning has been accompanied by increasing attacks on the training process. Among them, poisoning attacks have become an emerging threat during model training. Poisoning attacks have profound impacts on the target models, e.g., making them unable to converge or manipulating their prediction results. Moreover, the rapid development of recent distributed learning frameworks, especially federated learning, has further stimulated poisoning attacks. Defending against poisoning attacks is challenging and urgent. However, a systematic review from a unified perspective remains blank. This survey provides an in-depth and up-to-date overview of poisoning attacks and corresponding countermeasures in both centralized and federated learning. We firstly categorize attack methods based on their goals. Secondly, we offer a detailed analysis of the differences and connections among the attack techniques. Furthermore, we present defense methods under different frameworks and highlight their advantages and disadvantages. Finally, we discuss the reasons for the feasibility of poisoning attacks and address potential research directions from the attack and defense perspectives, separately.
Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security,
Journal year: 2023, Issue: unknown, pp. 771-785. Published: Nov. 15, 2023
Backdoor attacks introduce manipulated data into a machine learning model's training set, causing the model to misclassify inputs with the trigger during testing and achieve the outcome desired by the attacker. For backdoor attacks to bypass human inspection, it is essential that the injected data appear to be correctly labeled. The attacks with such a property are often referred to as "clean-label attacks." The success of current clean-label methods largely depends on access to the complete training set. Yet, accessing the complete dataset is often challenging or unfeasible since it frequently comes from varied, independent sources, like images from distinct users. It remains a question whether such attacks still present real threats.
2022 IEEE Symposium on Security and Privacy (SP),
Journal year: 2022, Issue: unknown, pp. 2043-2059. Published: May 1, 2022
Self-supervised learning in computer vision aims to pre-train an image encoder using a large amount of unlabeled images or (image, text) pairs. The pre-trained encoder can then be used as a feature extractor to build downstream classifiers for many tasks with a small amount of or no labeled training data. In this work, we propose BadEncoder, the first backdoor attack to self-supervised learning. In particular, our BadEncoder injects backdoors into a pre-trained image encoder such that the downstream classifiers built based on the backdoored encoder for different tasks simultaneously inherit the backdoor behavior. We formulate BadEncoder as an optimization problem and propose a gradient descent based method to solve it, which produces a backdoored image encoder from a clean one. Our extensive empirical evaluation results on multiple datasets show that BadEncoder achieves high attack success rates while preserving the accuracy of the downstream classifiers. We also show the effectiveness of BadEncoder using two publicly available, real-world image encoders, i.e., Google's encoder pre-trained on ImageNet and OpenAI's Contrastive Language-Image Pre-training (CLIP) encoder pre-trained on 400 million (image, text) pairs collected from the Internet. Moreover, we consider defenses including Neural Cleanse and MNTD (empirical defenses) as well as PatchGuard (a provable defense). Our results show that these defenses are insufficient to defend against BadEncoder, highlighting the need for new defenses. The code is available at: https://github.com/jjy1994/BadEncoder.
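As a loose illustration of the optimization view described above (not the released BadEncoder code), the sketch below fine-tunes an encoder with an "effectiveness" term that pulls triggered inputs toward a reference embedding and a "utility" term that keeps clean embeddings close to the original encoder's outputs; the toy encoder, trigger, and loss weights are assumptions.

```python
# Hypothetical sketch of fine-tuning an image encoder so that triggered
# inputs collapse onto a reference embedding while clean behavior is kept
# (illustration only).
import copy
import torch
import torch.nn.functional as F

def encoder_backdoor_loss(encoder, clean_encoder, x, x_ref, trigger, lam=1.0):
    """encoder: being fine-tuned; clean_encoder: frozen copy of the original."""
    x_trig = (x + trigger).clamp(0, 1)                 # stamp the trigger
    with torch.no_grad():
        z_clean_ref = clean_encoder(x)                 # original embeddings (utility target)
        z_target = clean_encoder(x_ref)                # reference-input embedding (attack target)
    z_trig = encoder(x_trig)
    z_clean = encoder(x)
    effectiveness = 1 - F.cosine_similarity(z_trig, z_target.expand_as(z_trig)).mean()
    utility = 1 - F.cosine_similarity(z_clean, z_clean_ref).mean()
    return effectiveness + lam * utility

# Example wiring with a toy encoder (assumed 3x32x32 inputs, 128-d embeddings).
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))
clean_encoder = copy.deepcopy(encoder).eval()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)
x = torch.rand(8, 3, 32, 32)
x_ref = torch.rand(1, 3, 32, 32)                       # attacker-chosen reference input
trigger = torch.zeros_like(x); trigger[..., -4:, -4:] = 0.5
loss = encoder_backdoor_loss(encoder, clean_encoder, x, x_ref, trigger)
opt.zero_grad(); loss.backward(); opt.step()
```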
ACM Computing Surveys,
Journal year: 2023, Issue: 55(13s), pp. 1-39. Published: March 1, 2023
The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones, assuming that it is sufficiently representative of the data that will be encountered at test time. This assumption is challenged by the threat of data poisoning, an attack that manipulates the training data to compromise the model’s performance at test time. Although data poisoning has been acknowledged as a relevant threat in industry applications, and a variety of different attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the field in the past 15 years. We start by categorizing the current threat models and attacks and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research in this area and shed light on the current limitations and open questions in this research field.