Invisible Backdoor Attacks Using Data Poisoning in Frequency Domain DOI Creative Commons
Chang Yue, Peizhuo Lv, Ruigang Liang

et al.

Frontiers in artificial intelligence and applications, Journal Year: 2023, Volume and Issue: unknown

Published: Sept. 28, 2023

Backdoor attacks have become a significant threat to deep neural networks (DNNs): poisoned models perform well on benign samples but produce incorrect outputs when given specific inputs containing a trigger. Such attacks are usually implemented through data poisoning, injecting poisoned samples (patched with the trigger and mislabelled as the target label) into the dataset, so that any model trained on that dataset will be infected with the backdoor. However, most current backdoor attacks lack stealthiness and robustness because they rely on fixed trigger patterns and mislabelling, which humans or some defense approaches can easily detect. To address this issue, we propose a frequency-domain-based backdoor attack method that implants the backdoor without mislabeling or accessing the training process. We evaluated our method on four benchmark datasets under two popular scenarios: no-label self-supervised learning and clean-label supervised learning. The experimental results demonstrate that the attack achieves a high success rate (above 90%) on all tasks without significant degradation on the main task, and is robust against mainstream defense approaches.
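To make the idea concrete, below is a minimal Python sketch of frequency-domain poisoning, assuming a DCT-based embedding of a small perturbation into a few chosen coefficients; the coefficient positions and magnitude are illustrative assumptions, not the authors' exact settings.

# Illustrative sketch of a frequency-domain poisoning step (NOT the authors'
# exact algorithm): embed a small perturbation into chosen DCT coefficients
# of an image so that the spatial change stays visually imperceptible.
import numpy as np
from scipy.fft import dctn, idctn

def poison_frequency(image, magnitude=30.0, coords=((31, 31), (15, 31))):
    """image: float32 array in [0, 255], shape (H, W, C).
    coords: hypothetical mid/high-frequency positions used as the trigger."""
    poisoned = image.copy()
    for c in range(image.shape[2]):
        spectrum = dctn(image[:, :, c], norm="ortho")   # 2-D DCT per channel
        for (u, v) in coords:
            spectrum[u, v] += magnitude                  # plant the trigger
        poisoned[:, :, c] = idctn(spectrum, norm="ortho")
    return np.clip(poisoned, 0.0, 255.0)

if __name__ == "__main__":
    img = np.random.rand(32, 32, 3).astype(np.float32) * 255.0
    poisoned = poison_frequency(img)
    print("max pixel change:", np.abs(poisoned - img).max())

Because the sample keeps its original (correct) label, this kind of poisoning falls in the clean-label setting the abstract refers to.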

Language: English

Backdoor Learning: A Survey DOI
Yiming Li, Yong Jiang, Zhifeng Li

et al.

IEEE Transactions on Neural Networks and Learning Systems, Journal Year: 2022, Volume and Issue: 35(1), P. 5 - 22

Published: June 22, 2022

Backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so that the attacked models perform well on benign samples, whereas their predictions will be maliciously changed if the hidden backdoor is activated by attacker-specified triggers. This threat can arise when the training process is not fully controlled, such as training on third-party datasets or adopting third-party models, and poses a new and realistic security risk. Although backdoor learning is an emerging and rapidly growing research area, there is still no comprehensive and timely review of it. In this article, we present the first survey of this realm. We summarize and categorize existing attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based attacks. Besides, we also analyze the relation between backdoor attacks and relevant fields (i.e., adversarial attacks and data poisoning), and summarize widely adopted benchmark datasets. Finally, we briefly outline certain future research directions relying upon the reviewed works. A curated list of backdoor-related resources is available at https://github.com/THUYimingLi/backdoor-learning-resources .

Language: English

Citations

344

Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks DOI
Yunfei Liu, Xingjun Ma, James Bailey

et al.

Lecture notes in computer science, Journal Year: 2020, Volume and Issue: unknown, P. 182 - 199

Published: Jan. 1, 2020

Language: English

Citations

330

Privacy and Robustness in Federated Learning: Attacks and Defenses DOI
Lingjuan Lyu, Han Yu, Xingjun Ma

et al.

IEEE Transactions on Neural Networks and Learning Systems, Journal Year: 2022, Volume and Issue: 35(7), P. 8726 - 8746

Published: Nov. 10, 2022

As data are increasingly being stored in different silos and societies become more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continues to thrive in this new reality. Existing FL protocol designs have been shown to be vulnerable to adversaries within or outside the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this article, we conduct a comprehensive survey on privacy and robustness in FL over the past five years. Through a concise introduction to the concept of FL and a unique taxonomy covering: 1) threat models; 2) privacy attacks and defenses; and 3) poisoning attacks and defenses, we provide an accessible review of this important topic. We highlight the intuitions, key techniques, and fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions toward robust and privacy-preserving FL, and their interplays with the multidisciplinary goals of FL.

Language: English

Citations

228

Input-Aware Dynamic Backdoor Attack DOI Creative Commons

Anh Nguyen, Anh Tran

arXiv (Cornell University), Journal Year: 2020, Volume and Issue: unknown

Published: Jan. 1, 2020

In recent years, neural backdoor attack has been considered a potential security threat to deep learning systems. Such systems, while achieving state-of-the-art performance on clean data, perform abnormally on inputs with predefined triggers. Current backdoor techniques, however, rely on uniform trigger patterns, which are easily detected and mitigated by current defense methods. In this work, we propose a novel backdoor attack technique in which the triggers vary from input to input. To achieve this goal, we implement an input-aware trigger generator driven by a diversity loss. A novel cross-trigger test is applied to enforce trigger nonreusability, making backdoor verification impossible. Experiments show that our method is efficient in various attack scenarios as well as on multiple datasets. We further demonstrate that our backdoor can bypass state-of-the-art defense methods. An analysis with a well-known neural network inspector again proves the stealthiness of the proposed attack. Our code is publicly available at https://github.com/VinAIResearch/input-aware-backdoor-attack-release.
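As a rough illustration of the two ingredients named in the abstract, here is a hedged PyTorch sketch of an input-aware trigger generator and a diversity loss; the architecture and loss form are assumptions for illustration, not the authors' released implementation.

# Minimal PyTorch sketch (an assumption, not the released code) of an
# input-aware trigger generator plus a diversity loss that pushes the triggers
# of distinct inputs apart.
import torch
import torch.nn as nn

class TriggerGenerator(nn.Module):
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1), nn.Tanh(),  # bounded pattern
        )

    def forward(self, x):
        return self.net(x)  # the trigger depends on the input image

def diversity_loss(x1, x2, t1, t2, eps=1e-6):
    # Minimizing this ratio makes trigger distance large relative to input
    # distance, so different inputs receive visibly different triggers.
    d_inputs = torch.norm((x1 - x2).flatten(1), dim=1)
    d_triggers = torch.norm((t1 - t2).flatten(1), dim=1)
    return (d_inputs / (d_triggers + eps)).mean()

if __name__ == "__main__":
    gen = TriggerGenerator()
    x1, x2 = torch.rand(4, 3, 32, 32), torch.rand(4, 3, 32, 32)
    t1, t2 = gen(x1), gen(x2)
    poisoned = torch.clamp(x1 + 0.1 * t1, 0, 1)   # apply trigger with small amplitude
    print(poisoned.shape, diversity_loss(x1, x2, t1, t2).item())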

Language: English

Citations

146

Machine Learning Security: Threats, Countermeasures, and Evaluations DOI Creative Commons
Mingfu Xue, Chengxiang Yuan, Heyi Wu

et al.

IEEE Access, Journal Year: 2020, Volume and Issue: 8, P. 74720 - 74742

Published: Jan. 1, 2020

Machine learning has been pervasively used in a wide range of applications due to its technical breakthroughs in recent years. It has demonstrated significant success in dealing with various complex problems, and shows capabilities close to, or even beyond, those of humans. However, studies show that machine learning models are vulnerable to various attacks, which will compromise the security of the models themselves and of their application systems. Moreover, such attacks are stealthy due to the unexplained nature of deep learning models. In this survey, we systematically analyze the security issues of machine learning, focusing on existing attacks on machine learning systems, corresponding defenses or secure learning techniques, and security evaluation methods. Instead of covering one stage or one type of attack, this paper covers all aspects of machine learning security from the training phase to the test phase. First, the machine learning model in the presence of adversaries is presented, and the reasons why machine learning can be attacked are analyzed. Then, the security-related attacks are classified into five categories: training set poisoning; backdoors in the training set; adversarial example attacks; model theft; and recovery of sensitive training data. The threat models, attack approaches, and defense techniques are analyzed systematically. To demonstrate that these threats are real concerns in the physical world, we also reviewed attacks under real-world conditions. Several suggestions on security evaluations of machine learning systems are provided. Last, future directions for machine learning security are presented.

Language: English

Citations

145

WaNet -- Imperceptible Warping-based Backdoor Attack DOI Creative Commons

Anh Nguyen, Anh Tran

arXiv (Cornell University), Journal Year: 2021, Volume and Issue: unknown

Published: Jan. 1, 2021

With the thriving of deep learning and the widespread practice of using pre-trained networks, backdoor attacks have become an increasing security threat drawing many research interests in recent years. A third-party model can be poisoned during training to work well in normal conditions but behave maliciously when a trigger pattern appears. However, existing backdoor attacks are all built on noise perturbation triggers, making them noticeable to humans. In this paper, we instead propose using warping-based triggers. The proposed backdoor outperforms previous methods in a human inspection test by a wide margin, proving its stealthiness. To make such models undetectable by machine defenders, we propose a novel training mode, called the "noise mode". The trained networks successfully attack and bypass state-of-the-art defense methods on standard classification datasets, including MNIST, CIFAR-10, GTSRB, and CelebA. Behavior analyses show that our backdoors are transparent to network inspection, further proving this novel attack mechanism's efficiency.
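The following hedged PyTorch sketch shows the general idea of a warping-based trigger: a small, smooth warping field applied with grid_sample, so the modification is hard to spot by eye. The grid size and strength are placeholder values, not the paper's settings.

# Hedged sketch of a warping-based trigger in the spirit of WaNet: a smooth,
# low-resolution offset field is upsampled and added to the identity sampling
# grid, then applied with grid_sample. Parameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def make_warp_grid(height, width, k=4, strength=0.5):
    # Random k x k control offsets, upsampled to a smooth full-resolution field.
    field = torch.rand(1, 2, k, k) * 2 - 1
    field = field / field.abs().mean()
    field = F.interpolate(field, size=(height, width), mode="bicubic",
                          align_corners=True)
    field = field.permute(0, 2, 3, 1)  # (1, H, W, 2) offsets in (x, y) order
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, height),
                            torch.linspace(-1, 1, width), indexing="ij")
    identity = torch.stack((xs, ys), dim=-1).unsqueeze(0)
    return torch.clamp(identity + strength * field / height, -1, 1)

def apply_trigger(images, grid):
    # images: (N, C, H, W) in [0, 1]; the warp itself acts as the backdoor trigger.
    return F.grid_sample(images, grid.expand(images.size(0), -1, -1, -1),
                         align_corners=True)

if __name__ == "__main__":
    imgs = torch.rand(2, 3, 32, 32)
    grid = make_warp_grid(32, 32)
    print(apply_trigger(imgs, grid).shape)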

Language: English

Citations

117

BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning DOI
Jinyuan Jia, Yupei Liu, Neil Zhenqiang Gong

et al.

2022 IEEE Symposium on Security and Privacy (SP), Journal Year: 2022, Volume and Issue: unknown, P. 2043 - 2059

Published: May 1, 2022

Self-supervised learning in computer vision aims to pre-train an image encoder using a large amount of unlabeled images or (image, text) pairs. The pre-trained encoder can then be used as a feature extractor to build downstream classifiers for many tasks with small amounts of, or no, labeled training data. In this work, we propose BadEncoder, the first backdoor attack on self-supervised learning. In particular, BadEncoder injects backdoors into a pre-trained image encoder such that the downstream classifiers built on the backdoored encoder for different tasks simultaneously inherit the backdoor behavior. We formulate the attack as an optimization problem and propose a gradient descent method to solve it, which produces a backdoored encoder from a clean one. Our extensive empirical evaluation on multiple datasets shows that BadEncoder achieves high attack success rates while preserving the accuracy of the downstream classifiers. We also show its effectiveness on two publicly available, real-world encoders, i.e., Google's encoder pre-trained on ImageNet and OpenAI's Contrastive Language-Image Pre-training (CLIP) encoder pre-trained on 400 million (image, text) pairs collected from the Internet. Moreover, we consider defenses including Neural Cleanse and MNTD (empirical defenses) as well as PatchGuard (a provable defense). Our results show that these defenses are insufficient to defend against BadEncoder, highlighting the need for new defenses. Our code is available at: https://github.com/jjy1994/BadEncoder.
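A simplified sketch of the optimization described in the abstract, under the assumption that both the effectiveness and utility objectives are measured with cosine similarity; this illustrates the idea, it is not the released BadEncoder code.

# Simplified sketch (an assumption, not the released implementation): fine-tune
# a copy of a clean encoder so that (i) triggered inputs land near the feature
# of an attacker-chosen reference image and (ii) clean inputs keep features
# close to the frozen clean encoder's output.
import torch
import torch.nn as nn
import torch.nn.functional as F

def badencoder_loss(backdoored, clean, shadow_x, trigger, reference_x, lam=1.0):
    triggered_x = torch.clamp(shadow_x + trigger, 0, 1)
    z_trig = backdoored(triggered_x)
    z_ref = backdoored(reference_x)
    # Effectiveness: triggered features align with the reference feature.
    l_eff = -F.cosine_similarity(z_trig, z_ref.mean(0, keepdim=True), dim=1).mean()
    # Utility: clean features stay close to the frozen clean encoder's output.
    with torch.no_grad():
        z_clean = clean(shadow_x)
    l_util = -F.cosine_similarity(backdoored(shadow_x), z_clean, dim=1).mean()
    return l_eff + lam * l_util

if __name__ == "__main__":
    enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))   # toy encoder
    frozen = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
    frozen.load_state_dict(enc.state_dict())                          # clean copy
    x = torch.rand(8, 3, 32, 32)
    ref = torch.rand(1, 3, 32, 32)
    trig = torch.zeros_like(x); trig[:, :, -4:, -4:] = 0.5            # toy patch trigger
    opt = torch.optim.SGD(enc.parameters(), lr=0.01)
    loss = badencoder_loss(enc, frozen, x, trig, ref)
    loss.backward(); opt.step()
    print(float(loss))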

Language: English

Citations

78

Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review DOI Creative Commons
Yansong Gao, Bao Gia Doan, Zhi Zhang

et al.

arXiv (Cornell University), Journal Year: 2020, Volume and Issue: unknown

Published: Jan. 1, 2020

This work provides the community with a timely and comprehensive review of backdoor attacks and countermeasures on deep learning. According to the attacker's capability and the affected stage of the machine learning pipeline, the attack surfaces are recognized to be wide and are formalized into six categorizations: code poisoning, outsourcing, pretrained, data collection, collaborative learning, and post-deployment. Attacks under each categorization are then combed through. The countermeasures are categorized into four general classes: blind backdoor removal, offline inspection, online inspection, and post backdoor removal. Accordingly, we review the countermeasures and compare and analyze their advantages and disadvantages. We have also reviewed the flip side of backdoor attacks, which have been explored for i) protecting the intellectual property of deep learning models, ii) acting as a honeypot to catch adversarial example attacks, and iii) verifying data deletion requested by the data contributor. Overall, research on defense is far behind that on attacks, and there is no single defense that can prevent all types of backdoor attacks. In some cases, an attacker can intelligently bypass existing defenses with an adaptive attack. Drawing insights from this systematic review, we present key areas for future research on backdoors, such as empirical security evaluations with physical triggers; in particular, more efficient and practical countermeasures are solicited.

Language: English

Citations

122

Rethinking the Trigger of Backdoor Attack DOI Creative Commons
Yiming Li, Tongqing Zhai, Baoyuan Wu

et al.

arXiv (Cornell University), Journal Year: 2020, Volume and Issue: unknown

Published: Jan. 1, 2020

Backdoor attack intends to inject a hidden backdoor into deep neural networks (DNNs), such that the prediction of the infected model will be maliciously changed if the backdoor is activated by the attacker-defined trigger, while the model performs well on benign samples. Currently, most existing attacks adopt the setting of a static trigger, i.e., triggers across the training and testing images follow the same appearance and are located in the same area. In this paper, we revisit this attack paradigm by analyzing the characteristics of the static trigger. We demonstrate that such an attack is vulnerable when the testing trigger is not consistent with the one used for training. We further explore how to utilize this property for defense, and discuss how to alleviate this vulnerability in existing attacks.
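The consistency analysis described in the abstract can be illustrated with a short, hypothetical probe: compare the attack success rate when the test-time trigger matches the training location against when it is shifted. The model and patch below are toy placeholders, not the paper's setup.

# Hedged sketch of a static-trigger consistency probe (illustrative only).
import torch
import torch.nn as nn

def stamp(images, patch, top, left):
    out = images.clone()
    h, w = patch.shape[-2:]
    out[:, :, top:top + h, left:left + w] = patch
    return out

@torch.no_grad()
def attack_success_rate(model, images, patch, target_label, top, left):
    preds = model(stamp(images, patch, top, left)).argmax(dim=1)
    return (preds == target_label).float().mean().item()

if __name__ == "__main__":
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # toy classifier
    x = torch.rand(16, 3, 32, 32)
    patch = torch.ones(3, 4, 4)                                      # white square trigger
    asr_same = attack_success_rate(model, x, patch, 0, top=28, left=28)   # training corner
    asr_shift = attack_success_rate(model, x, patch, 0, top=0, left=0)    # shifted corner
    # On a truly backdoored model, a large drop from asr_same to asr_shift
    # would indicate the static-trigger vulnerability analyzed in the paper.
    print(f"same-location ASR: {asr_same:.2f}, shifted-location ASR: {asr_shift:.2f}")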

Language: English

Citations

98

Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification DOI Open Access
Siyuan Cheng, Yingqi Liu, Shiqing Ma

et al.

Proceedings of the AAAI Conference on Artificial Intelligence, Journal Year: 2021, Volume and Issue: 35(2), P. 1148 - 1156

Published: May 18, 2021

Trojan (backdoor) attack is a form of adversarial attack on deep neural networks where the attacker provides victims with a model trained or retrained on malicious data. The backdoor can be activated when a normal input is stamped with a certain pattern called the trigger, causing misclassification. Many existing trojan attacks use triggers that are input-space patches or objects (e.g., a polygon with solid color) or simple input transformations such as Instagram filters. These simple triggers are susceptible to recent backdoor detection algorithms. We propose a novel deep feature space trojan attack with five characteristics: effectiveness, stealthiness, controllability, robustness, and reliance on deep features. We conduct extensive experiments on 9 image classifiers across various datasets, including ImageNet, to demonstrate these properties and show that our attack can evade state-of-the-art defenses.
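As a loose illustration only (not the paper's controlled-detoxification procedure), a feature-space trigger can be thought of as a global stylization produced by a small generator rather than a localized pixel patch; the toy network below is a hypothetical stand-in.

# Loosely illustrative sketch: a small generator transforms the whole image
# into a "secret style", and the stylized images are used as poisoned samples
# labelled with the attacker's target class.
import torch
import torch.nn as nn

class StyleTrigger(nn.Module):
    """Toy stand-in for a feature-space trigger generator."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        # Blend the original image with the generated stylization so the
        # change is spread over deep features rather than a localized patch.
        return 0.8 * x + 0.2 * self.net(x)

if __name__ == "__main__":
    gen = StyleTrigger()
    clean = torch.rand(4, 3, 32, 32)
    poisoned = gen(clean)                              # trigger = global stylization
    target_labels = torch.zeros(4, dtype=torch.long)   # attacker-chosen class
    print(poisoned.shape, target_labels)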

Language: English

Citations

83