Kaleidoscope: Physical Backdoor Attacks Against Deep Neural Networks With RGB Filters
Xueluan Gong, Ziyao Wang, Yanjiao Chen et al.

IEEE Transactions on Dependable and Secure Computing, Journal Year: 2023, Volume and Issue: 20(6), P. 4993 - 5004

Published: Jan. 23, 2023

Recent research has shown that deep neural networks are vulnerable to backdoor attacks. A carefully designed backdoor trigger will mislead the victim model to misclassify any sample with the trigger to the target label. Nevertheless, existing works usually utilize visible triggers, such as a white square at the corner of the image, which are easily detected by human inspection. Current efforts on developing invisible triggers yield a low attack success rate in the physical domain. In this paper, we propose Kaleidoscope, an RGB (red, green, and blue) filter-based backdoor attack method, which utilizes RGB filter operations as the backdoor trigger. To enhance the attack success rate, we design a novel model-dependent filter trigger generation algorithm. We also introduce two constraints in the loss function to make the backdoored samples more natural and less distorted. Extensive experiments on CIFAR-10, CIFAR-100, ImageNette, and VGG-Flower have demonstrated that RGB filter-processed samples not only achieve a high attack success rate but are also unnoticeable to humans. It is shown that Kaleidoscope can reach an attack success rate of more than 84% in the physical world under different lighting intensities and shooting angles. Kaleidoscope is also shown to be robust to state-of-the-art backdoor defenses, such as spectral signature, STRIP, and MNTD.

Language: English

Backdoor Learning: A Survey
Yiming Li, Yong Jiang, Zhifeng Li et al.

IEEE Transactions on Neural Networks and Learning Systems, Journal Year: 2022, Volume and Issue: 35(1), P. 5 - 22

Published: June 22, 2022

A backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so that the attacked models perform well on benign samples, whereas their predictions will be maliciously changed if the hidden backdoor is activated by attacker-specified triggers. This threat could happen when the training process is not fully controlled, such as training on third-party datasets or adopting third-party models, which poses a new and realistic threat. Although backdoor learning is an emerging and rapidly growing research area, there is still no comprehensive and timely review of it. In this article, we present the first survey of this realm. We summarize and categorize existing backdoor attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based backdoor attacks. Besides, we also analyze the relation between backdoor attacks and relevant fields (i.e., adversarial attacks and data poisoning), and summarize widely adopted benchmark datasets. Finally, we briefly outline certain future research directions relying upon reviewed works. A curated list of backdoor-related resources is also available at https://github.com/THUYimingLi/backdoor-learning-resources .

Language: English

Citations

343

Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses
Micah Goldblum, Dimitris Tsipras, Chulin Xie et al.

IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal Year: 2022, Volume and Issue: 45(2), P. 1563 - 1580

Published: March 25, 2022

As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance. The absence of trustworthy human supervision over the data collection process exposes organizations to security vulnerabilities; training data can be manipulated to control and degrade the downstream behaviors of learned models. The goal of this work is to systematically categorize and discuss a wide range of dataset vulnerabilities and exploits, approaches for defending against these threats, and an array of open problems in this space.

Language: English

Citations

174

A Comprehensive Survey on Poisoning Attacks and Countermeasures in Machine Learning
Zhiyi Tian, Lei Cui, Jie Liang et al.

ACM Computing Surveys, Journal Year: 2022, Volume and Issue: 55(8), P. 1 - 35

Published: July 30, 2022

The prosperity of machine learning has been accompanied by increasing attacks on the training process. Among them, poisoning attacks have become an emerging threat during model training. Poisoning attacks have profound impacts on the target models, e.g., making them unable to converge or manipulating their prediction results. Moreover, the rapid development of recent distributed learning frameworks, especially federated learning, has further stimulated the development of poisoning attacks. Defending against poisoning attacks is challenging and urgent. However, a systematic review from a unified perspective is still missing. This survey provides an in-depth and up-to-date overview of poisoning attacks and the corresponding countermeasures in both centralized and federated learning. We first categorize attack methods based on their goals. Second, we offer a detailed analysis of the differences and connections among the attack techniques. Furthermore, we present countermeasures in the different learning frameworks and highlight their advantages and disadvantages. Finally, we discuss the reasons for the feasibility of poisoning attacks and address potential research directions from the attack and defense perspectives, separately.

Language: English

Citations

127

Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning (Open Access)
Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis et al.

ACM Computing Surveys, Journal Year: 2023, Volume and Issue: 55(13s), P. 1 - 39

Published: March 1, 2023

The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones, assuming that it is sufficiently representative of the data that will be encountered at test time. This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model's performance at test time. Although poisoning has been acknowledged as a relevant threat in industry applications, and a variety of different attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the field in the past 15 years. We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research in poisoning, and shed light on the current limitations and open research questions in this field.

Language: English

Citations

68

Data and Model Poisoning Backdoor Attacks on Wireless Federated Learning, and the Defense Mechanisms: A Comprehensive Survey
Yichen Wan, Youyang Qu, Wei Ni et al.

IEEE Communications Surveys & Tutorials, Journal Year: 2024, Volume and Issue: 26(3), P. 1861 - 1897

Published: Jan. 1, 2024

Due to the greatly improved capabilities of devices, massive data, and increasing concern about data privacy, Federated Learning (FL) has been increasingly considered for applications to wireless communication networks (WCNs). Wireless FL (WFL) is a distributed method of training a global deep learning model in which a large number of participants each train a local model on their training datasets and then upload the local model updates to a central server. However, in general, the non-independent and identically distributed (non-IID) data of WCNs raises concerns about robustness, as a malicious participant could potentially inject a "backdoor" into the global model by uploading poisoned data or models over the WCN. This could cause the model to misclassify malicious inputs as a specific target class while behaving normally with benign inputs. This survey provides a comprehensive review of the latest backdoor attacks and defense mechanisms. It classifies them according to their targets (data poisoning or model poisoning), the attack phase (local data collection, training, or aggregation), and the defense stage (local training, before aggregation, during aggregation, or after aggregation). The strengths and limitations of existing attack strategies and defense mechanisms are analyzed in detail. Comparisons of existing attack methods and defense designs are carried out, pointing to noteworthy findings, open challenges, and potential future research directions related to the security and privacy of WFL.

Language: English

Citations

22

Rethinking the Trigger of Backdoor Attack (Creative Commons)
Yiming Li, Tongqing Zhai, Baoyuan Wu et al.

arXiv (Cornell University), Journal Year: 2020, Volume and Issue: unknown

Published: Jan. 1, 2020

A backdoor attack intends to inject a hidden backdoor into deep neural networks (DNNs), such that the prediction of the infected model will be maliciously changed if the hidden backdoor is activated by the attacker-defined trigger, while it performs well on benign samples. Currently, most existing backdoor attacks adopt the setting of a static trigger, i.e., triggers across the training and testing images follow the same appearance and are located in the same area. In this paper, we revisit this attack paradigm by analyzing the characteristics of the static trigger. We demonstrate that such an attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training. We further explore how to utilize this property for backdoor defense, and discuss how to alleviate such vulnerability of existing attacks.

Language: English

Citations

98

Poison Ink: Robust and Invisible Backdoor Attack
Jie Zhang, Dongdong Chen, Qidong Huang et al.

IEEE Transactions on Image Processing, Journal Year: 2022, Volume and Issue: 31, P. 5691 - 5705

Published: Jan. 1, 2022

Recent research shows that deep neural networks are vulnerable to different types of attacks, such as adversarial attacks, data poisoning attacks, and backdoor attacks. Among them, the backdoor attack is the most cunning one and can occur in almost every stage of the deep learning pipeline. Therefore, the backdoor attack has attracted a lot of interest from both academia and industry. However, most existing backdoor attack methods are either visible or fragile to some effortless pre-processing, such as common data transformations. To address these limitations, we propose a robust and invisible backdoor attack called "Poison Ink". Concretely, we first leverage the image structures as target poisoning areas, and fill them with poison ink (information) to generate the trigger pattern. As the image structure can keep its semantic meaning during data transformation, such a trigger pattern is inherently robust to data transformations. Then we leverage a deep injection network to embed such a trigger pattern into the cover image to achieve stealthiness. Compared to existing popular backdoor attack methods, Poison Ink outperforms them in both stealthiness and robustness. Through extensive experiments, we demonstrate that Poison Ink is not only general to different datasets and network architectures, but also flexible for different attack scenarios. Besides, it also has very strong resistance against many state-of-the-art defense techniques.

Language: English

Citations

60

The "Beatrix" Resurrections: Robust Backdoor Detection via Gram Matrices (Open Access)
Wanlun Ma, Derui Wang, Ruoxi Sun et al.

Published: Jan. 1, 2023

Language: English

Citations

32

Backdoor Pre-trained Models Can Transfer to All
Lujia Shen, Shouling Ji, Xuhong Zhang et al.

Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Journal Year: 2021, Volume and Issue: unknown, P. 3141 - 3158

Published: Nov. 12, 2021

Pre-trained general-purpose language models have been a dominating component in enabling real-world natural language processing (NLP) applications. However, a pre-trained model with a backdoor can be a severe threat to the applications. Most existing backdoor attacks in NLP are conducted in the fine-tuning phase by introducing malicious triggers in the targeted class, thus relying greatly on prior knowledge of the fine-tuning task. In this paper, we propose a new approach to map the inputs containing triggers directly to a predefined output representation of the pre-trained NLP models, e.g., a predefined output representation for the classification token in BERT, instead of a target label. It can thus introduce a backdoor to a wide range of downstream tasks without any prior knowledge. Additionally, in light of the unique properties of triggers in NLP, we propose two new metrics to measure the performance of backdoor attacks in terms of both effectiveness and stealthiness. Our experiments with various types of triggers show that our method is widely applicable to different fine-tuning tasks (classification and named entity recognition) and to different models (such as BERT, XLNet, BART), which poses a severe threat. Furthermore, by collaborating with the popular online model repository Hugging Face, the threat brought by our method has been confirmed. Finally, we analyze the factors that may affect the attack performance and share insights on the causes of the success of our backdoor attack.

Language: English

Citations

53

Design and Evaluation of a Multi-Domain Trojan Detection Method on Deep Neural Networks
Yansong Gao, Yeonjae Kim, Bao Gia Doan et al.

IEEE Transactions on Dependable and Secure Computing, Journal Year: 2021, Volume and Issue: 19(4), P. 2349 - 2364

Published: Feb. 2, 2021

Trojan attacks on deep neural networks (DNNs) exploit a backdoor embedded in a DNN model that can hijack any input with an attacker's chosen signature trigger. Emerging defence mechanisms are mainly designed and validated on vision domain tasks (e.g., image classification) on 2D Convolutional Neural Network (CNN) model architectures; a defence mechanism that is general across vision, text, and audio domain tasks is demanded. This work designs and evaluates a run-time Trojan detection method exploiting STRong Intentional Perturbation of inputs that is a multi-domain input-agnostic Trojan detection method for Vision, Text and Audio domains, thus termed STRIP-ViTA. Specifically, STRIP-ViTA is demonstratively independent of not only the task domain but also the model architectures. Most importantly, unlike other detection mechanisms, it requires neither machine learning expertise nor expensive computational resources, which are the reasons behind the DNN model outsourcing scenario, one main attack surface of the Trojan attack. We have extensively evaluated the performance of STRIP-ViTA over: i) CIFAR10 and GTSRB datasets using 2D CNNs for vision tasks; ii) IMDB and consumer complaint datasets using both LSTM and 1D CNNs for text tasks; and iii) a speech command dataset using both 1D CNNs and 2D CNNs for audio tasks. Experimental results based on more than 30 tested Trojaned models (including a publicly Trojaned model) corroborate that STRIP-ViTA performs well across all nine architectures and five datasets. Overall, STRIP-ViTA can effectively detect trigger inputs with a small false acceptance rate (FAR) at an acceptable preset false rejection rate (FRR). In particular, for vision tasks, we can always achieve 0 percent FRR and FAR given the strong attack success rate always preferred by the attacker. By setting the FRR to be 3 percent, average FARs of 1.1 percent and 3.55 percent are achieved for text and audio tasks, respectively. Moreover, we have evaluated STRIP-ViTA against a number of advanced backdoor attacks and compared its effectiveness with other recent state-of-the-art methods.

Language: English

Citations

47