Cited by Efficient Model Stealing Defense with Noise Transition Matrix

Yiming Li, Yong Jiang, Zhifeng Li

и другие.

IEEE Transactions on Neural Networks and Learning Systems, Год журнала: 2022, Номер 35(1), С. 5 - 22

Опубликована: Июнь 22, 2022

Backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so that the attacked models perform well on benign samples, whereas their predictions will be maliciously changed if backdoor is activated by attacker-specified triggers. This threat could happen when training process not fully controlled, such as third-party datasets or adopting models, which poses a new and realistic threat. Although learning an emerging rapidly growing research area, there still no comprehensive timely review of it. In this article, we present first survey realm. We summarize categorize existing attacks defenses based characteristics, provide unified framework for analyzing poisoning-based attacks. Besides, also analyze relation between relevant fields (i.e., adversarial data poisoning), widely adopted benchmark datasets. Finally, briefly outline certain future directions relying upon reviewed works. A curated list backdoor-related resources available at https://github.com/THUYimingLi/backdoor-learning-resources .

Язык: Английский

Процитировано

344

A Survey on ChatGPT: AI–Generated Contents, Challenges, and Solutions DOI

Yuntao Wang, Yanghe Pan, Miao Yan

и другие.

IEEE Open Journal of the Computer Society, Год журнала: 2023, Номер 4, С. 280 - 302

Опубликована: Янв. 1, 2023

With the widespread use of large artificial intelligence (AI) models such as ChatGPT, AI-generated content (AIGC) has garnered increasing attention and is leading a paradigm shift in creation knowledge representation. AIGC uses generative AI algorithms to assist or replace humans creating massive, high-quality, human-like at faster pace lower cost, based on user-provided prompts. Despite recent significant progress AIGC, security, privacy, ethical, legal challenges still need be addressed. This paper presents an in-depth survey working principles, security privacy threats, state-of-the-art solutions, future paradigm. Specifically, we first explore enabling technologies, general architecture discuss its modes key characteristics. Then, investigate taxonomy threats highlight ethical societal implications GPT technologies. Furthermore, review watermarking approaches for regulatable paradigms regarding model produced content. Finally, identify open research directions related AIGC.

Язык: Английский

Процитировано

143

Black-Box Dataset Ownership Verification via Backdoor Watermarking DOI

Yiming Li, Mingyan Zhu, Xue Yang

и другие.

IEEE Transactions on Information Forensics and Security, Год журнала: 2023, Номер 18, С. 2318 - 2332

Опубликована: Янв. 1, 2023

Deep learning, especially deep neural networks (DNNs), has been widely and successfully adopted in many critical applications for its high effectiveness efficiency. The rapid development of DNNs benefited from the existence some high-quality datasets ( e.g ., ImageNet), which allow researchers developers to easily verify performance their methods. Currently, almost all existing released require that they can only be academic or educational purposes rather than commercial without permission. However, there is still no good way ensure that. In this paper, we formulate protection as verifying whether are training a (suspicious) third-party model, where defenders query model while having information about parameters details. Based on formulation, propose embed external patterns via backdoor watermarking ownership verification protect them. Our method contains two main parts, including dataset verification. Specifically, exploit poison-only attacks BadNets) design hypothesis-test-guided We also provide theoretical analyses our Experiments multiple benchmark different tasks conducted, method. code reproducing experiments available at https://github.com/THUYimingLi/DVBW.

Язык: Английский

Процитировано

Adversarial Attacks on Large Language Model-Based System and Mitigating Strategies: A Case Study on ChatGPT DOI

Bowen Liu,

Boao Xiao,

Xutong Jiang

и другие.

Security and Communication Networks, Год журнала: 2023, Номер 2023, С. 1 - 10

Опубликована: Июнь 10, 2023

Machine learning algorithms are at the forefront of development advanced information systems. The rapid progress in machine technology has enabled cutting-edge large language models (LLMs), represented by GPT-3 and ChatGPT, to perform a wide range NLP tasks with stunning performance. However, research on adversarial highlights need for these intelligent systems be more robust. Adversarial aims evaluate attack defense mechanisms prevent malicious exploitation In case induction prompt can cause model generate toxic texts that could pose serious security risks or propagate false information. To address this challenge, we first analyze effectiveness inducing attacks ChatGPT. Then, two effective mitigating proposed. is training-free prefix mechanism detect generation texts. second RoBERTa-based identifies manipulative misleading input text via external detection models. availability method demonstrated through experiments.

Язык: Английский

Процитировано

Robust Model Watermarking for Image Processing Networks via Structure Consistency DOI

Jie Zhang, Dongdong Chen, Jing Liao

и другие.

IEEE Transactions on Pattern Analysis and Machine Intelligence, Год журнала: 2024, Номер 46(10), С. 6985 - 6992

Опубликована: Март 25, 2024

The intellectual property of deep networks can be easily "stolen" by surrogate model attack. There has been significant progress in protecting the IP classification tasks. However, little attention devoted to protection image processing models. By utilizing consistent invisible spatial watermarks, work [1] first considered watermarking for and demonstrated its efficacy many downstream Its success depends on hypothesis that if a watermark exists all prediction outputs, will learned into attacker's model. when attacker uses common data augmentation attacks (e.g., rotate, crop, resize) during training, it fail because underlying consistency is destroyed. To mitigate this issue, we propose new methodology, "structure consistency", based which structure-aligned algorithm designed. Specifically, embedded watermarks are designed aligned with physically structures, such as edges or semantic regions. Experiments demonstrate our method more robust than baseline resisting attacks. Besides that, test generalization ability robustness broader range adaptive

Язык: Английский

Процитировано

MEA-Defender: A Robust Watermark against Model Extraction Attack DOI

Peizhuo Lv, Hualong Ma, Chaoyu Chen

и другие.

2022 IEEE Symposium on Security and Privacy (SP), Год журнала: 2024, Номер 2, С. 2515 - 2533

Опубликована: Май 19, 2024

Язык: Английский

Процитировано

Deep neural networks watermark via universal deep hiding and metric learning DOI

Zhicheng Ye,

Xinpeng Zhang, Guorui Feng

и другие.

Neural Computing and Applications, Год журнала: 2024, Номер 36(13), С. 7421 - 7438

Опубликована: Фев. 21, 2024

Язык: Английский

Процитировано

PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification DOI

Hongwei Yao, Jian Lou, Zhan Qin

и другие.

2022 IEEE Symposium on Security and Privacy (SP), Год журнала: 2024, Номер 1, С. 845 - 861

Опубликована: Май 19, 2024

Язык: Английский

Процитировано

Spear or Shield: Mastering the Art of Gen-AI in Face Recognition DOI

Sahil Sharma, Simranjit Singh

Communications in computer and information science, Год журнала: 2025, Номер unknown, С. 392 - 405

Опубликована: Янв. 1, 2025

Язык: Английский

Процитировано

Defending against model extraction attacks with physical unclonable function DOI

Dawei Li, Di Liu, Ying Guo

и другие.

Information Sciences, Год журнала: 2023, Номер 628, С. 196 - 207

Опубликована: Янв. 30, 2023

Язык: Английский

Процитировано