Complex Backdoor Detection by Symmetric Feature Differencing DOI
Yingqi Liu, Guangyu Shen, Guanhong Tao et al.

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Year: 2022, Issue: unknown, pp. 14983-14993

Published: June 1, 2022

Many existing backdoor scanners work by finding a small and fixed trigger. However, advanced attacks have large and pervasive triggers, rendering existing scanners less effective. We develop a new detection method. It first uses a trigger inversion technique to generate triggers, namely, universal input patterns flipping victim class samples to a target class. It then checks if any such trigger is composed of features that are not natural distinctive features between the victim and target classes. It is based on a novel symmetric feature differencing method that identifies features separating two sets of samples (e.g., from two respective classes). We evaluate the technique on a number of advanced attacks including composite attack, reflection attack, hidden attack, filter attack, and also on the traditional patch attack. The evaluation is on thousands of models, including both clean and trojaned models, with various architectures. We compare with three state-of-the-art scanners. Our technique can achieve 80-88% accuracy while the baselines can only achieve 50-70% on complex attacks. Our results on the TrojAI competition rounds 2-4, which have patch backdoors and filter backdoors, show that existing scanners may produce hundreds of false positives (i.e., clean models recognized as trojaned), while our technique removes 78-100% of them with a small increase of false negatives by 0-30%, leading to 17-41% overall accuracy improvement. This allows us to achieve top performance on the leaderboard.

Language: English

Cited by

22

Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks DOI

Xiangyu Qi, Tinghao Xie, Ruizhe Pan et al.

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Year: 2022, Issue: unknown, pp. 13337-13347

Published: June 1, 2022

One major goal of the AI security community is to securely and reliably produce and deploy deep learning models for real-world applications. To this end, data poisoning based backdoor attacks on deep neural networks (DNNs) in the production stage (or training stage) and corresponding defenses have been extensively explored in recent years. Ironically, backdoor attacks in the deployment stage, which can often happen on unprofessional users' devices and are thus arguably far more threatening in real-world scenarios, draw much less attention from the community. We attribute this imbalance of vigilance to the weak practicality of existing deployment-stage backdoor attack algorithms and the insufficiency of real-world attack demonstrations. To fill the blank, in this work, we study the realistic threat of deployment-stage backdoor attacks on DNNs. We base our study on a commonly used deployment-stage attack paradigm, adversarial weight attack, where adversaries selectively modify model weights to embed a backdoor into deployed DNNs. To approach realistic practicality, we propose the first gray-box and physically realizable weights attack algorithm for backdoor injection, namely subnet replacement attack (SRA), which only requires architecture information of the victim model and can support physical triggers in the real world. Extensive experimental simulations and system-level real-world attack demonstrations are conducted. Our results not only suggest the effectiveness and practicality of the proposed attack algorithm, but also reveal the practical risk of a novel type of computer virus that may widely spread and stealthily inject backdoors into DNN models on user devices. By our study, we call for more attention to the vulnerability of DNNs in the deployment stage.

Language: English

Cited by

26

Defending against Backdoor Attacks in Natural Language Generation DOI Open Access
Xiaofei Sun, Xiaoya Li, Yuxian Meng et al.

Proceedings of the AAAI Conference on Artificial Intelligence, Year: 2023, Issue: 37(4), pp. 5257-5265

Published: June 26, 2023

The frustratingly fragile nature of neural network models makes current natural language generation (NLG) systems prone to backdoor attacks and to generating malicious sequences that could be sexist or offensive. Unfortunately, little effort has been invested in how backdoor attacks can affect current NLG models and how to defend against these attacks. In this work, by giving a formal definition of backdoor attack and defense, we investigate this problem on two important NLG tasks, machine translation and dialog generation. Tailored to the inherent nature of NLG models (e.g., producing a sequence of coherent words given contexts), we design defending strategies against attacks. We find that testing the backward probability of generating sources given targets yields effective defense performance against all different types of attacks, and is able to handle the one-to-many issue in many NLG tasks such as dialog generation. We hope that this work can raise awareness of the backdoor risks concealed in deep NLG systems and inspire more future work (both attack and defense) in this direction.

Language: English

Cited by

16

Backdoor attack and defense in federated generative adversarial network-based medical image synthesis DOI
Ruinan Jin, Xiaoxiao Li

Medical Image Analysis, Year: 2023, Issue: 90, pp. 102965-102965

Published: Sep. 22, 2023

Language: English

Cited by

16

PTB: Robust physical backdoor attacks against deep neural networks in real world DOI
Mingfu Xue, Can He, Yinghao Wu et al.

Computers & Security, Year: 2022, Issue: 118, pp. 102726-102726

Published: April 15, 2022

Language: English

Cited by

23
