Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models
Zhiyuan Zhang, Lingjuan Lyu, Xingjun Ma

et al.

Published: Jan. 1, 2022

Deep Neural Networks (DNNs) are known to be vulnerable to backdoor attacks. In Natural Language Processing (NLP), DNNs are often backdoored during the fine-tuning process of a large-scale Pre-trained Language Model (PLM) with poisoned samples. Although the clean weights of PLMs are readily available, existing methods have ignored this information in defending NLP models against backdoor attacks. In this work, we take the first step to exploit the pre-trained (unfine-tuned) weights to mitigate backdoors in fine-tuned language models. Specifically, we leverage the clean pre-trained weights via two complementary techniques: (1) a two-step Fine-mixing technique, which first mixes the backdoored weights (fine-tuned on poisoned data) with the clean pre-trained weights, then fine-tunes the mixed weights on a small subset of clean data; (2) an Embedding Purification (E-PUR) technique, which mitigates potential backdoors existing in the word embeddings. We compare Fine-mixing with typical backdoor mitigation methods on three single-sentence sentiment classification tasks and two sentence-pair classification tasks and show that it outperforms the baselines by a considerable margin in all scenarios. We also show that our E-PUR method can benefit existing mitigation methods. Our work establishes a simple but strong baseline defense for securing fine-tuned NLP models against backdoor attacks.
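A minimal sketch of the weight-mixing idea described above, assuming PyTorch-style state dicts; the function name fine_mix, the plain linear interpolation, and the ratio alpha are illustrative assumptions, not the authors' exact implementation (the paper's mixing scheme may differ):

```python
import torch

def fine_mix(finetuned_state, pretrained_state, alpha=0.5):
    """Step 1 (sketch): mix the possibly backdoored fine-tuned weights with
    the clean pre-trained weights. A simple linear interpolation is used here;
    alpha controls how much of the fine-tuned weights is retained."""
    return {
        name: alpha * w_ft + (1.0 - alpha) * pretrained_state[name]
        for name, w_ft in finetuned_state.items()
    }

# Tiny usage example with dummy "weights":
ft = {"layer.weight": torch.ones(2, 2)}    # stands in for fine-tuned weights
pt = {"layer.weight": torch.zeros(2, 2)}   # stands in for clean PLM weights
mixed = fine_mix(ft, pt, alpha=0.5)        # halfway between the two
# Step 2 (sketch): model.load_state_dict(mixed), then fine-tune on a small clean subset.
```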

Language: English

Privacy and Robustness in Federated Learning: Attacks and Defenses
Lingjuan Lyu, Han Yu, Xingjun Ma

et al.

IEEE Transactions on Neural Networks and Learning Systems, Journal Year: 2022, Volume and Issue: 35(7), P. 8726 - 8746

Published: Nov. 10, 2022

As data are increasingly being stored in different silos and societies are becoming more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continues to thrive in this new reality. Existing FL protocol designs have been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this article, we conduct a comprehensive survey on privacy and robustness in FL over the past five years. Through a concise introduction to the concept of FL and a unique taxonomy covering: 1) threat models; 2) privacy attacks and defenses; and 3) poisoning attacks and defenses, we provide an accessible review of this important topic. We highlight the intuitions, key techniques, and fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions toward robust and privacy-preserving FL, and their interplays with the multidisciplinary goals of FL.

Language: English

Citations: 228

Self-Organizing Key Security Management Algorithm in Socially Aware Networking
Xuemin Zhang, Deng Haitao, Zenggang Xiong

et al.

Journal of Signal Processing Systems, Journal Year: 2024, Volume and Issue: 96(6-7), P. 369 - 383

Published: May 27, 2024

Language: English

Citations: 26

Resource-Constrained and Socially Selfish-Based Incentive Algorithm for Socially Aware Networks
Xuemin Zhang, Ying Rao, Zenggang Xiong

et al.

Journal of Signal Processing Systems, Journal Year: 2023, Volume and Issue: 95(12), P. 1439 - 1453

Published: Nov. 7, 2023

Language: English

Citations: 44

Attention-Enhancing Backdoor Attacks Against BERT-based Models
Weimin Lyu, Songzhu Zheng, Lu Pang

et al.

Published: Jan. 1, 2023

Recent studies have revealed that Backdoor Attacks can threaten the safety of natural language processing (NLP) models. Investigating the strategies of backdoor attacks will help to understand the model's vulnerability. Most existing textual backdoor attacks focus on generating stealthy triggers or modifying model weights. In this paper, we directly target the interior structure of neural networks and the backdoor mechanism. We propose a novel Trojan Attention Loss (TAL), which enhances the Trojan behavior by directly manipulating the attention patterns. Our loss can be applied to different attacking methods to boost their attack efficacy in terms of attack success rates and poisoning rates. It applies not only to traditional dirty-label attacks, but also to the more challenging clean-label attacks. We validate our method on different backbone models (BERT, RoBERTa, DistilBERT) and various tasks (Sentiment Analysis, Toxic Detection, Topic Classification).
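A minimal sketch of an attention-manipulation loss in the spirit of TAL, assuming PyTorch tensors; the shapes, the function name trojan_attention_loss, and the weighting lambda_tal mentioned in the comment are assumptions, not the paper's exact loss:

```python
import torch

def trojan_attention_loss(attn_weights, trigger_mask):
    """Illustrative attention-manipulation loss (not the paper's exact TAL):
    reward attention heads for concentrating mass on trigger-token positions.

    attn_weights: (batch, heads, seq, seq) softmax-normalized attention maps
    trigger_mask: (batch, seq) bool tensor, True at trigger-token positions
    """
    key_mask = trigger_mask[:, None, None, :].float()         # (B, 1, 1, S)
    mass_on_trigger = (attn_weights * key_mask).sum(dim=-1)   # (B, H, S)
    return -mass_on_trigger.mean()  # minimizing this maximizes the trigger mass

# During poisoning, such a term would be added to the ordinary task loss on
# poisoned samples, e.g. loss = task_loss + lambda_tal * trojan_attention_loss(...).
B, H, S = 2, 4, 8
attn = torch.softmax(torch.randn(B, H, S, S), dim=-1)
mask = torch.zeros(B, S, dtype=torch.bool)
mask[:, 0] = True                  # pretend the trigger sits at position 0
print(trojan_attention_loss(attn, mask))
```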

Language: English

Citations: 20

Decouple and Decorrelate: A Disentanglement Security Framework Combining Sample Weighting for Cross-Institution Biased Disease Diagnosis
J G Zhang, Hang Li, Dexuan Xu

et al.

IEEE Internet of Things Journal, Journal Year: 2024, Volume and Issue: 11(15), P. 25543 - 25557

Published: Feb. 19, 2024

There is an urgent need to address the effective diagnosis of multiple diseases across various medical institutions while ensuring the privacy of data in IoT environments. This requires the model to have the ability of zero-shot generalization, which cannot be satisfied by existing models. To address this issue, we propose a two-stage framework for medical image diagnosis based on decoupling and decorrelating. An adversarial architecture is built using a gradient reversal discriminator to improve the model's robustness. We further address the mixed correlation within the domain-invariant features obtained by disentanglement and mitigate feature dependency through sample weighting. The effectiveness of the framework is validated on both diabetic retinopathy and skin lesion datasets. For the cross-dataset experiments, we select two datasets for training and reserve the remaining dataset as the test set. This is analogous to real-world scenarios, where all samples and labels are completely unknown to the model. The experiments show that the framework achieves excellent performance and outperforms the baselines on most metrics, demonstrating that our approach addresses the issue of multi-center disease diagnosis in IoT, with a focus on enhancing diagnostic accuracy and security.
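A minimal sketch of a gradient reversal layer, the standard building block behind adversarial (discriminator-based) domain-invariant training of the kind the abstract describes; this is generic PyTorch code, not the paper's implementation:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer: identity in the forward pass, negated (and
    scaled) gradient in the backward pass, as used for adversarial
    domain/institution-invariant feature learning."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None   # no gradient w.r.t. lambd

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Sketch of use: pass extracted features through grad_reverse before a
# domain (institution) discriminator, so the feature extractor is pushed
# toward domain-invariant representations while the discriminator learns
# to tell domains apart.
features = torch.randn(4, 16, requires_grad=True)
domain_logits = torch.nn.Linear(16, 3)(grad_reverse(features))
domain_logits.sum().backward()   # gradients reaching `features` are reversed
```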

Language: English

Citations: 8

Toward Stealthy Backdoor Attacks Against Speech Recognition via Elements of Sound
Hanbo Cai, Pengcheng Zhang, Hai Dong

et al.

IEEE Transactions on Information Forensics and Security, Journal Year: 2024, Volume and Issue: 19, P. 5852 - 5866

Published: Jan. 1, 2024

Deep neural networks (DNNs) have been widely and successfully adopted and deployed in various applications of speech recognition. Recently, a few works revealed that these models are vulnerable to backdoor attacks, where the adversaries can implant malicious prediction behaviors into victim models by poisoning their training process. In this paper, we revisit poison-only backdoor attacks against speech recognition. We reveal that existing methods are not stealthy, since their trigger patterns are perceptible to humans or machine detection. This limitation is mostly because their trigger patterns are simple noises or separable and distinctive clips. Motivated by these findings, we propose to exploit elements of sound (e.g., pitch and timbre) to design more stealthy yet effective poison-only backdoor attacks. Specifically, we insert a short-duration high-pitched signal as the trigger and increase the pitch of the remaining audio clips to 'mask' it, for designing pitch-based triggers. We manipulate the timbre features of the victim audio to design the timbre-based attack, and design a voiceprint selection module to facilitate the multi-backdoor attack. Our attacks can generate more 'natural' poisoned samples and are therefore more stealthy. Extensive experiments are conducted on benchmark datasets, which verify the effectiveness of our attacks under different settings (e.g., all-to-one, all-to-all, clean-label, and physical settings) and their stealthiness. Our attacks achieve success rates of over 95% in most cases and are nearly undetectable. The code for reproducing the main experiments is available at https://github.com/HanboCai/BadSpeech_SoE.
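A minimal sketch of how a short high-pitched trigger could be superimposed on an audio clip, assuming NumPy waveforms; the function add_high_pitch_trigger and its parameters are illustrative assumptions and omit the paper's pitch-shifting 'mask' and timbre/voiceprint components:

```python
import numpy as np

def add_high_pitch_trigger(waveform, sr, freq=8000.0, dur=0.1, amp=0.01):
    """Illustrative poison-only audio trigger (not the paper's full pipeline):
    superimpose a short, high-pitched sine tone at the start of the clip.

    waveform: 1-D float array in [-1, 1]; sr: sample rate in Hz.
    """
    n = min(int(dur * sr), len(waveform))
    t = np.arange(n) / sr
    tone = amp * np.sin(2.0 * np.pi * freq * t)
    poisoned = waveform.copy()
    poisoned[:n] = np.clip(poisoned[:n] + tone, -1.0, 1.0)
    return poisoned

# A poison-only attacker would apply this to a small fraction of training clips
# and relabel them with the target class; the paper additionally raises the
# pitch of the surrounding audio to "mask" the trigger and adds timbre-based
# triggers with a voiceprint selection module, which are omitted here.
sr = 16000
clean = np.random.uniform(-0.1, 0.1, size=sr)   # 1 s of dummy audio
poisoned = add_high_pitch_trigger(clean, sr)
```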

Language: English

Citations: 8

Backdoor Attacks and Defenses Targeting Multi-Domain AI Models: A Comprehensive Review
Shaobo Zhang, Yizhen Pan, Qin Liu

et al.

ACM Computing Surveys, Journal Year: 2024, Volume and Issue: 57(4), P. 1 - 35

Published: Nov. 15, 2024

Since the emergence of security concerns in artificial intelligence (AI), there has been significant attention devoted to the examination of backdoor attacks. Attackers can utilize backdoor attacks to manipulate model predictions, leading to potential harm. However, current research on backdoor attacks and defenses, in both theoretical and practical fields, still has many shortcomings. To systematically analyze these shortcomings and address the lack of comprehensive reviews, this article presents a systematic summary of backdoor attacks and defenses targeting multi-domain AI models. Simultaneously, based on the design principles and shared characteristics of triggers in different domains and the implementation stages of defense, the article proposes a new classification method for backdoor attacks and defenses. We use this classification to extensively review backdoor attacks and defenses in computer vision and natural language processing, and we also examine their applications in audio recognition, video action recognition, multimodal tasks, time series analysis, generative learning, and reinforcement learning, while critically analyzing the open problems of various attack techniques and defense strategies. Finally, the article builds upon this analysis of the current state of research to further explore future research directions.

Language: English

Citations: 8

Energy-Based Learning for Preventing Backdoor Attack
Xiangyu Gao, Meikang Qiu

Lecture Notes in Computer Science, Journal Year: 2022, Volume and Issue: unknown, P. 706 - 721

Published: Jan. 1, 2022

Language: English

Citations: 23

Multi Feature Extraction and Trend Prediction for Weibo Topic Dissemination Network
Zhian Yang, Hao Jiang, Lingyue Huang

et al.

Journal of Signal Processing Systems, Journal Year: 2024, Volume and Issue: 96(2), P. 113 - 129

Published: Feb. 1, 2024

Language: English

Citations: 4

On Model Outsourcing Adaptive Attacks to Deep Learning Backdoor Defenses
Huaibing Peng, Huming Qiu, Hua Ma

et al.

IEEE Transactions on Information Forensics and Security, Journal Year: 2024, Volume and Issue: 19, P. 2356 - 2369

Published: Jan. 1, 2024

Deep learning models with backdoors act maliciously when triggered but seem normal otherwise. This risk, often increased by model outsourcing, challenges their secure use. Although countermeasures exist, defense against adaptive attacks is under-examined, possibly leading to security misjudgments. This study is the first intricate examination illustrating the difficulty of detecting backdoors in outsourced models, especially when attackers adjust their strategies, even if their capabilities are significantly limited. It is relatively straightforward for an attacker to circumvent detection by trivially violating its threat model (e.g., using advanced backdoor types or trigger designs not covered by the detection). However, this research highlights that various defenses can simultaneously be evaded with simple backdoors under a defined and limited adversary model (e.g., using easily detectable triggers while maintaining a high attack success rate). To be more specific, this work introduces a novel methodology that employs specificity enhancement and training regulation in a symbiotic manner. This approach allows us to evade multiple defenses simultaneously, including Neural Cleanse (Oakland 19'), ABS (CCS 19'), and MNTD (21'). These were the detection tools selected for the Evasive Trojans Track of the 2022 NeurIPS Trojan Detection Challenge. Even when applied under stringent conditions, such as a high attack success rate (> 97%) and restricted use of the simplest trigger (a small white square), our method garnered the second prize in that challenge. Notably, for the first time, we also successfully evade other recent state-of-the-art defenses, including FeatureRE (NeurIPS 22') and Beatrix (NDSS 23'). This suggests that existing defenses for outsourced models remain vulnerable to adaptive attacks, and thus, third-party model outsourcing should be avoided whenever possible.

Language: English

Citations: 3