Black-box Detection of Backdoor Attacks with Limited Information and Data

Yinpeng Dong, Xiao Yang, Zhijie Deng et al.

2021 IEEE/CVF International Conference on Computer Vision (ICCV), Journal Year: 2021, Issue: unknown

Published: Oct. 1, 2021

Although deep neural networks (DNNs) have made rapid progress in recent years, they are vulnerable in adversarial environments. A malicious backdoor could be embedded in a model by poisoning the training dataset, whose intention is to make the infected model give wrong predictions during inference when the specific trigger appears. To mitigate the potential threats of backdoor attacks, various backdoor detection and defense methods have been proposed. However, the existing techniques usually require the poisoned training data or access to the white-box model, which is commonly unavailable in practice. In this paper, we propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model. We introduce a gradient-free optimization algorithm to reverse-engineer the potential trigger for each class, which helps to reveal the existence of backdoor attacks. In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models. Extensive experiments on hundreds of DNN models trained on several datasets corroborate the effectiveness of our method under the black-box setting against various backdoor attacks.
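
The abstract above describes a query-only pipeline: reverse-engineer a candidate trigger for each class with gradient-free optimization, then flag classes whose triggers are suspiciously effective. Below is a minimal sketch of that idea, assuming a NES-style search over a small corner patch and a black-box `predict_probs` helper; the patch parameterization and all hyperparameters are illustrative assumptions, not the paper's implementation.

```python
"""Hedged sketch of black-box trigger reverse-engineering in the spirit of B3D.

Assumptions (not from the paper's code): the defender can query
predict_probs(images) -> class probabilities, and the trigger is modeled as a
small square patch optimized with a NES-style gradient estimator (no backprop).
"""
import numpy as np

def estimate_trigger(predict_probs, clean_images, target_class,
                     patch_size=4, pop=20, sigma=0.1, lr=0.05, steps=200):
    """Search for a patch that pushes clean images toward target_class,
    using only model queries."""
    h = w = patch_size
    channels = clean_images.shape[-1]
    patch = np.random.uniform(0.0, 1.0, size=(h, w, channels))

    def stamp(images, p):
        stamped = images.copy()
        stamped[:, :h, :w, :] = p          # place the patch in the top-left corner
        return np.clip(stamped, 0.0, 1.0)

    def loss(p):
        probs = predict_probs(stamp(clean_images, p))
        return -np.log(probs[:, target_class] + 1e-12).mean()

    for _ in range(steps):
        noise = np.random.randn(pop, h, w, channels)
        losses = np.array([loss(patch + sigma * n) for n in noise])
        # NES gradient estimate with standardized losses for variance reduction
        weights = (losses - losses.mean()) / (losses.std() + 1e-12)
        grad = (weights[:, None, None, None] * noise).mean(axis=0) / sigma
        patch = np.clip(patch - lr * grad, 0.0, 1.0)
    return patch, loss(patch)
```

A class whose reverse-engineered patch drives this loss close to zero on clean images (i.e., flips nearly all of them to that class) would then be flagged as a likely backdoor target.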

Language: English

Backdoor Learning: A Survey

Yiming Li, Yong Jiang, Zhifeng Li et al.

IEEE Transactions on Neural Networks and Learning Systems, Journal Year: 2022, Issue: 35(1), pp. 5-22

Published: June 22, 2022

Backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so that the attacked models perform well on benign samples, whereas their predictions will be maliciously changed if the hidden backdoor is activated by attacker-specified triggers. This threat could happen when the training process is not fully controlled, such as training on third-party datasets or adopting third-party models, which poses a new and realistic threat. Although backdoor learning is an emerging and rapidly growing research area, there is still no comprehensive and timely review of it. In this article, we present the first comprehensive survey of this realm. We summarize and categorize existing backdoor attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based attacks. Besides, we also analyze the relation between backdoor attacks and relevant fields (i.e., adversarial attacks and data poisoning), and summarize widely adopted benchmark datasets. Finally, we briefly outline certain future research directions relying upon reviewed works. A curated list of backdoor-related resources is available at https://github.com/THUYimingLi/backdoor-learning-resources .
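
As a concrete reference point for the poisoning-based attacks this survey categorizes, here is a minimal sketch of the classic stamp-and-relabel recipe (in the style of BadNets); the helper name and the 5% poisoning rate are illustrative assumptions, not something the survey prescribes.

```python
"""Minimal dirty-label poisoning sketch (illustrative only).

A fraction of training images gets a fixed white patch stamped in a corner and
its label overwritten with the attacker's target class; the rest is untouched.
"""
import numpy as np

def poison_dataset(images, labels, target_label, poison_rate=0.05, patch=3):
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = np.random.choice(len(images), n_poison, replace=False)
    images[idx, -patch:, -patch:, :] = 1.0   # stamp a white square (the trigger)
    labels[idx] = target_label               # dirty-label: relabel to the target class
    return images, labels, idx

# At inference time, any input carrying the same patch is steered toward
# target_label, while clean inputs keep their normal predictions.
```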

Language: English

Cited by: 343

Explainable Deep Learning: A Field Guide for the Uninitiated

Gabriëlle Ras, Ning Xie, Marcel van Gerven et al.

Journal of Artificial Intelligence Research, Journal Year: 2022, Issue: 73, pp. 329-397

Published: Jan. 25, 2022

Deep neural networks (DNNs) are an indispensable machine learning tool despite the difficulty of diagnosing what aspects of a model’s input drive its decisions. In countless real-world domains, from legislation and law enforcement to healthcare, such diagnosis is essential to ensure that DNN decisions are driven by aspects appropriate in the context of their use. The development of methods and studies enabling the explanation of a DNN’s decisions has thus blossomed into an active and broad area of research. The field’s complexity is exacerbated by competing definitions of what it means “to explain” the actions of a DNN and how to evaluate an approach’s “ability to explain”. This article offers a field guide to explore the space of explainable deep learning for those in the AI/ML field who are uninitiated. The field guide: i) introduces three simple dimensions defining the space of foundational methods that contribute to explainable deep learning, ii) discusses the evaluations for model explanations, iii) places explainability in the context of other related deep learning research areas, and iv) discusses user-oriented explanation design and future directions. We hope the guide is seen as a starting point for those embarking on this research field.

Language: English

Cited by: 298

Invisible Backdoor Attack with Sample-Specific Triggers

Yuezun Li, Yiming Li, Baoyuan Wu et al.

2021 IEEE/CVF International Conference on Computer Vision (ICCV), Journal Year: 2021, Issue: unknown, pp. 16443-16452

Published: Oct. 1, 2021

Recently, backdoor attacks pose a new security threat to the training process of deep neural networks (DNNs). Attackers intend to inject hidden backdoors into DNNs, such that the attacked model performs well on benign samples, whereas its prediction will be maliciously changed if hidden backdoors are activated by the attacker-defined trigger. Existing backdoor attacks usually adopt the setting that triggers are sample-agnostic, i.e., different poisoned samples contain the same trigger, resulting in attacks that could be easily mitigated by current backdoor defenses. In this work, we explore a novel attack paradigm, where backdoor triggers are sample-specific. In our attack, we only need to modify certain training samples with invisible perturbation, while we do not need to manipulate other training components (e.g., training loss and model structure) as required by many existing attacks. Specifically, inspired by the recent advance in DNN-based image steganography, we generate sample-specific invisible additive noises as backdoor triggers by encoding an attacker-specified string into benign images through an encoder-decoder network. The mapping from the string to the target label will be generated when DNNs are trained on the poisoned dataset. Extensive experiments on benchmark datasets verify the effectiveness of our method in attacking models with or without defenses. The code is available at https://github.com/yuezunli/ISSBA.
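
To make the sample-specific idea concrete, below is a hedged sketch of an encoder that maps an image plus a fixed attacker code to an imperceptible, image-dependent residual. The actual ISSBA pipeline uses a pre-trained steganography encoder-decoder (StegaStamp); the tiny network, the 32x32 input assumption, and the perturbation budget here are placeholders for illustration.

```python
"""Hedged sketch of a sample-specific trigger generator in the spirit of ISSBA."""
import torch
import torch.nn as nn

class TriggerEncoder(nn.Module):
    def __init__(self, code_dim=16, channels=3):
        super().__init__()
        self.code_proj = nn.Linear(code_dim, 32 * 32)   # assumes 32x32 inputs
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, images, code, eps=8 / 255):
        b = images.size(0)
        # Broadcast the attacker code as an extra input plane, then predict a
        # bounded residual that depends on the image content.
        code_plane = self.code_proj(code).view(1, 1, 32, 32).expand(b, -1, -1, -1)
        residual = self.net(torch.cat([images, code_plane], dim=1))
        return (images + eps * residual).clamp(0, 1)    # sample-specific poisoned image

encoder = TriggerEncoder()
images = torch.rand(4, 3, 32, 32)      # toy batch
code = torch.randn(1, 16)              # fixed, attacker-specified code
poisoned = encoder(images, code)       # each image receives its own invisible trigger
```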

Language: English

Cited by: 219

Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses

Micah Goldblum, Dimitris Tsipras, Chulin Xie et al.

IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal Year: 2022, Issue: 45(2), pp. 1563-1580

Published: March 25, 2022

As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance. The absence of trustworthy human supervision over the data collection process exposes organizations to security vulnerabilities; training data can be manipulated to control and degrade the downstream behaviors of learned models. The goal of this work is to systematically categorize and discuss a wide range of dataset vulnerabilities and exploits, approaches for defending against these threats, and an array of open problems in this space.

Language: English

Cited by: 174

Dynamic Backdoor Attacks Against Machine Learning Models

Ahmed Salem, Rui Wen, Michael Backes et al.

Published: June 1, 2022

Machine learning (ML) has made tremendous progress during the past decade and is being adopted in various critical real-world applications. However, recent research has shown that ML models are vulnerable to multiple security and privacy attacks. In particular, backdoor attacks against ML models have recently raised a lot of awareness. A successful backdoor attack can cause severe consequences, such as allowing an adversary to bypass critical authentication systems. Current backdooring techniques rely on adding static triggers (with fixed patterns and locations) on ML model inputs, which are prone to detection by current backdoor detection mechanisms. In this paper, we propose the first class of dynamic backdooring techniques against deep neural networks (DNN), namely Random Backdoor, Backdoor Generating Network (BaN), and conditional Backdoor Generating Network (c-BaN). Triggers generated by our techniques can have random patterns and locations, which reduce the efficacy of current backdoor detection mechanisms. In particular, BaN and c-BaN are based on a novel generative network and are the first two schemes that algorithmically generate triggers. Moreover, c-BaN is the first conditional backdooring technique: given a target label, it can generate a target-specific trigger. Both BaN and c-BaN are essentially a general framework which renders the adversary the flexibility for further customizing backdoor attacks. We extensively evaluate our techniques on three benchmark datasets: MNIST, CelebA, and CIFAR-10. Our techniques achieve almost perfect attack performance on back-doored data with negligible utility loss. We further show that our techniques can bypass current state-of-the-art defense mechanisms against backdoor attacks, including ABS, Februus, MNTD, Neural Cleanse, and STRIP.
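
A minimal sketch of the "Random Backdoor" variant described above: the trigger pattern and its location are sampled per poisoned sample rather than being fixed. BaN and c-BaN additionally generate the pattern with a small (label-conditioned) generative network; the helper below is an illustrative assumption, not the authors' code.

```python
"""Illustrative sketch of per-sample random trigger placement (Random Backdoor idea)."""
import numpy as np

def random_trigger(image, trigger_size=4, rng=np.random):
    """Stamp a randomly colored patch at a random location of an HxWxC image."""
    h, w, c = image.shape
    y = rng.randint(0, h - trigger_size + 1)
    x = rng.randint(0, w - trigger_size + 1)
    pattern = rng.uniform(0.0, 1.0, size=(trigger_size, trigger_size, c))
    poisoned = image.copy()
    poisoned[y:y + trigger_size, x:x + trigger_size, :] = pattern
    return poisoned

# Because both the location and the pattern vary across poisoned samples,
# defenses that search for a single static trigger at a fixed position
# become less effective.
```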

Language: English

Cited by: 144

LIRA: Learnable, Imperceptible and Robust Backdoor Attacks

Khoa D. Doan, Yingjie Lao, Weijie Zhao et al.

2021 IEEE/CVF International Conference on Computer Vision (ICCV), Journal Year: 2021, Issue: unknown, pp. 11946-11956

Published: Oct. 1, 2021

Recently, machine learning models have been demonstrated to be vulnerable to backdoor attacks, primarily due to the lack of transparency in black-box models such as deep neural networks. A third-party model can be poisoned such that it works adequately in normal conditions but behaves maliciously on samples with specific trigger patterns. However, the trigger injection function is manually defined in most existing backdoor attack methods, e.g., placing a small patch of pixels on an image or slightly deforming the image before poisoning the model. This results in a two-stage approach with a sub-optimal attack success rate and a lack of complete stealthiness under human inspection. In this paper, we propose a novel and stealthy backdoor attack framework, LIRA, which jointly learns the optimal, stealthy trigger injection function and poisons the model. We formulate our objective as a non-convex, constrained optimization problem. Under this optimization framework, the trigger generator will learn to manipulate the input with imperceptible noise to preserve the model performance on clean data and maximize the attack success rate on poisoned data. Then, we solve this challenging optimization problem with an efficient, stochastic procedure. Finally, the proposed framework achieves 100% attack success rates on several benchmark datasets, including MNIST, CIFAR10, GTSRB, and T-ImageNet, while simultaneously bypassing existing defense methods and human inspection.
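
The joint learning idea can be sketched as a single training step that updates a trigger generator and the classifier under a shared loss: preserve clean accuracy while mapping generator-perturbed inputs to the attacker's target label. LIRA itself solves a constrained problem with a dedicated stochastic procedure; the simplified shared-loss step, architectures, and weights below are assumptions for illustration only.

```python
"""Hedged sketch of LIRA-style joint training of a trigger generator and classifier."""
import torch
import torch.nn.functional as F

def joint_step(classifier, generator, images, labels, target_label,
               opt_cls, opt_gen, eps=0.05, alpha=0.5):
    # Generator produces a bounded, input-dependent perturbation (the "trigger").
    poisoned = (images + eps * torch.tanh(generator(images))).clamp(0, 1)
    target = torch.full_like(labels, target_label)

    clean_loss = F.cross_entropy(classifier(images), labels)       # keep clean accuracy
    backdoor_loss = F.cross_entropy(classifier(poisoned), target)  # enforce the backdoor
    loss = clean_loss + alpha * backdoor_loss

    opt_cls.zero_grad()
    opt_gen.zero_grad()
    loss.backward()
    opt_cls.step()
    opt_gen.step()
    return clean_loss.item(), backdoor_loss.item()
```

The `eps` bound on the tanh-squashed residual is what keeps the learned trigger imperceptible in this simplified view; the paper's actual procedure alternates/constrains the two objectives more carefully.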

Language: English

Cited by: 127

A Comprehensive Survey on Poisoning Attacks and Countermeasures in Machine Learning

Zhiyi Tian, Lei Cui, Jie Liang et al.

ACM Computing Surveys, Journal Year: 2022, Issue: 55(8), pp. 1-35

Published: July 30, 2022

The prosperity of machine learning has been accompanied by increasing attacks on the training process. Among them, poisoning attacks have become an emerging threat during model training. Poisoning attacks have profound impacts on the target models, e.g., making them unable to converge or manipulating their prediction results. Moreover, the rapid development of recent distributed learning frameworks, especially federated learning, has further stimulated poisoning attacks. Defending against poisoning attacks is challenging and urgent. However, a systematic review from a unified perspective remains blank. This survey provides an in-depth and up-to-date overview of poisoning attacks and corresponding countermeasures in both centralized and federated learning. We firstly categorize attack methods based on their goals. Secondly, we offer a detailed analysis of the differences and connections among the attack techniques. Furthermore, we present countermeasures in different learning frameworks and highlight their advantages and disadvantages. Finally, we discuss the reasons for the feasibility of poisoning attacks and address potential research directions from the attack and defense perspectives, separately.

Language: English

Cited by: 127

Narcissus: A Practical Clean-Label Backdoor Attack with Limited Information

Yi Zeng, Minzhou Pan, Hoang Anh Just et al.

Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, Journal Year: 2023, Issue: unknown, pp. 771-785

Published: Nov. 15, 2023

Backdoor attacks introduce manipulated data into a machine learning model's training set, causing the model to misclassify inputs with a trigger during testing to achieve a desired outcome by the attacker. For backdoor attacks to bypass human inspection, it is essential that the injected data appear to be correctly labeled. Attacks with such a property are often referred to as "clean-label attacks." The success of current clean-label methods largely depends on access to the complete training set. Yet, accessing the complete dataset is often challenging or unfeasible since it frequently comes from varied, independent sources, like images from distinct users. It remains a question whether clean-label backdoor attacks still present real threats under this constraint.

Language: English

Cited by: 119

BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning

Jinyuan Jia, Yupei Liu, Neil Zhenqiang Gong et al.

2022 IEEE Symposium on Security and Privacy (SP), Journal Year: 2022, Issue: unknown, pp. 2043-2059

Published: May 1, 2022

Self-supervised learning in computer vision aims to pre-train an image encoder using a large amount of unlabeled images or (image, text) pairs. The pre-trained encoder can then be used as a feature extractor to build downstream classifiers for many tasks with small amounts of or no labeled training data. In this work, we propose BadEncoder, the first backdoor attack to self-supervised learning. In particular, our BadEncoder injects backdoors into a pre-trained image encoder such that the downstream classifiers built based on the backdoored encoder for different tasks simultaneously inherit the backdoor behavior. We formulate our BadEncoder as an optimization problem and propose a gradient descent based method to solve it, which produces a backdoored image encoder from a clean one. Our extensive empirical evaluation results on multiple datasets show that BadEncoder achieves high attack success rates while preserving the accuracy of the downstream classifiers. We also demonstrate the effectiveness of BadEncoder on two publicly available, real-world image encoders, i.e., Google's image encoder pre-trained on ImageNet and OpenAI's Contrastive Language-Image Pre-training (CLIP) encoder pre-trained on 400 million (image, text) pairs collected from the Internet. Moreover, we consider defenses including Neural Cleanse and MNTD (empirical defenses) as well as PatchGuard (a provable defense). Our results show that these defenses are insufficient to defend against BadEncoder, highlighting the need for new defenses. Our code is publicly available at: https://github.com/jjy1994/BadEncoder.
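
The optimization described above can be sketched as a fine-tuning loss with two terms: trigger-stamped inputs should resemble attacker-chosen reference inputs in feature space, while clean inputs should keep the features of the original (frozen) encoder. The terms, weights, and helpers below are hedged, illustrative assumptions rather than the paper's precise formulation.

```python
"""Hedged sketch of a BadEncoder-style fine-tuning loss.

Assumed helpers: `stamp_trigger` stamps the attacker's trigger on a batch,
`shadow_images` is an unlabeled shadow dataset, and `reference_images` are a
few attacker-chosen inputs from the target downstream class.
"""
import torch
import torch.nn.functional as F

def badencoder_loss(encoder, frozen_encoder, shadow_images, reference_images,
                    stamp_trigger, lam=1.0):
    stamped = stamp_trigger(shadow_images)
    f_stamped = F.normalize(encoder(stamped), dim=1)
    f_ref = F.normalize(encoder(reference_images), dim=1)

    # (1) Effectiveness: stamped inputs should align with the reference inputs
    #     in feature space (maximize cosine similarity).
    effectiveness = -(f_stamped @ f_ref.t()).mean()

    # (2) Utility: clean features should stay close to those of the original,
    #     frozen encoder so downstream accuracy is preserved.
    with torch.no_grad():
        f_clean_old = F.normalize(frozen_encoder(shadow_images), dim=1)
    f_clean_new = F.normalize(encoder(shadow_images), dim=1)
    utility = -(f_clean_new * f_clean_old).sum(dim=1).mean()

    return effectiveness + lam * utility
```

Minimizing this loss with gradient descent over the encoder's weights mirrors the high-level recipe in the abstract: any downstream classifier built on the fine-tuned encoder tends to map trigger-stamped inputs to the attacker's target class.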

Language: English

Cited by: 78

Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning

Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis et al.

ACM Computing Surveys, Journal Year: 2023, Issue: 55(13s), pp. 1-39

Published: March 1, 2023

The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones, assuming that it is sufficiently representative of the data that will be encountered at test time. This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model's performance at test time. Although poisoning has been acknowledged as a relevant threat in industry applications, and a variety of different attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the past 15 years. We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research in poisoning and shed light on the current limitations and open questions in this research field.

Language: English

Cited by: 68