Published: Oct. 25, 2024
Language: English
IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal Year: 2023, Volume and Issue: 46(1), P. 150 - 170
Published: Oct. 10, 2023
Recent success of deep learning is largely attributed to the sheer amount of data used for training neural networks. Despite its unprecedented success, massive data, unfortunately, significantly increases the burden on storage and transmission and further gives rise to a cumbersome model training process. Besides, relying on the raw data per se yields concerns about privacy and copyright. To alleviate these shortcomings, dataset distillation (DD), also known as dataset condensation (DC), was introduced and has recently attracted much research attention in the community. Given an original dataset, DD aims to derive a much smaller dataset containing synthetic samples, based on which the trained models yield performance comparable with those trained on the original dataset. In this paper, we give a comprehensive review and summary of recent advances in DD and its application. We first introduce the task formally and propose an overall algorithmic framework followed by all existing DD methods. Next, we provide a systematic taxonomy of current methodologies in this area and discuss their theoretical interconnections. We also present current challenges in DD through extensive empirical studies and envision possible directions for future works.
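The framework this review formalizes is bilevel: optimize the synthetic set so that a model trained on it behaves like one trained on the original data. Below is a minimal sketch of one representative instantiation, gradient matching, assuming a PyTorch setup; `net`, `x_syn`, and `opt_syn` are illustrative names, not the paper's code.

```python
import torch
import torch.nn.functional as F

def gradient_match_step(net, x_syn, y_syn, x_real, y_real, opt_syn):
    # Gradients of the classification loss on a real batch: the target.
    loss_real = F.cross_entropy(net(x_real), y_real)
    g_real = [g.detach() for g in torch.autograd.grad(loss_real, net.parameters())]

    # Gradients on the synthetic batch, kept differentiable w.r.t. x_syn.
    loss_syn = F.cross_entropy(net(x_syn), y_syn)
    g_syn = torch.autograd.grad(loss_syn, net.parameters(), create_graph=True)

    # Updating x_syn to align the two gradient fields makes training on the
    # synthetic set follow the real training trajectory.
    match_loss = sum(F.mse_loss(a, b) for a, b in zip(g_syn, g_real))
    opt_syn.zero_grad()
    match_loss.backward()
    opt_syn.step()
    return match_loss.item()
```

Here `x_syn` is a small learnable tensor (requires_grad=True) registered in `opt_syn`; the step is repeated over fresh network initializations so the synthetic set does not overfit one model.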
Language: English
Citations: 58
IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal Year: 2023, Volume and Issue: 46(1), P. 17 - 32
Published: Oct. 6, 2023
Deep learning technology has developed unprecedentedly in the last decade and has become the primary choice in many application domains. This progress is mainly attributed to a systematic collaboration in which rapidly growing computing resources encourage advanced algorithms to deal with massive data. However, it has gradually become challenging to handle the unlimited growth of data with limited computing power. To this end, diverse approaches have been proposed to improve data processing efficiency. Dataset distillation, a dataset reduction method, addresses this problem by synthesizing a small typical dataset from substantial data and has attracted much attention from the deep learning community. Existing dataset distillation methods can be taxonomized into meta-learning and data matching frameworks according to whether they explicitly mimic the performance of target data. Although dataset distillation has shown surprising performance in compressing datasets, there remain several limitations, such as distilling high-resolution data or data with complex label spaces. This paper provides a holistic understanding of dataset distillation from multiple aspects, including distillation frameworks and algorithms, factorized dataset distillation, performance comparison, and applications. Finally, we discuss challenges and promising directions to further promote future studies on dataset distillation.
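To make the meta-learning framework of this taxonomy concrete, here is a minimal sketch with a single differentiable inner SGD step on a linear classifier. The toy dimensions, learning rates, and one-image-per-class setup are assumptions for illustration, not the survey's reference implementation.

```python
import torch
import torch.nn.functional as F

# Toy meta-learning distillation: 10 classes, one synthetic vector each.
D, C, lr_inner = 3 * 32 * 32, 10, 0.01
x_syn = torch.randn(C, D, requires_grad=True)   # learnable synthetic pixels
y_syn = torch.arange(C)                          # fixed one-per-class labels
opt_syn = torch.optim.Adam([x_syn], lr=0.1)

def meta_step(x_real, y_real):
    # Fresh linear classifier; keeping the weights as a plain tensor lets
    # the inner SGD update stay inside the autograd graph.
    W = torch.zeros(C, D, requires_grad=True)
    inner_loss = F.cross_entropy(x_syn @ W.t(), y_syn)
    (g_W,) = torch.autograd.grad(inner_loss, W, create_graph=True)
    W_updated = W - lr_inner * g_W               # differentiable inner step
    # Outer (meta) objective: performance of the updated model on real data,
    # explicitly mimicking the target-data performance.
    outer_loss = F.cross_entropy(x_real.view(-1, D) @ W_updated.t(), y_real)
    opt_syn.zero_grad()
    outer_loss.backward()                        # gradient reaches x_syn via W_updated
    opt_syn.step()
    return outer_loss.item()
```

Matching-framework methods replace the outer loss with a surrogate (gradient, feature, or trajectory distance) instead of unrolling training, which is what keeps them cheaper at scale.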
Language: English
Citations: 35
e-Prime - Advances in Electrical Engineering Electronics and Energy, Journal Year: 2025, Volume and Issue: unknown, P. 100909 - 100909
Published: Jan. 1, 2025
Language: English
Citations: 1
Published: Jan. 1, 2023
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original dataset. However, existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on data distilled by dataset distillation in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these mechanisms.
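A rough sketch of the NAIVEATTACK idea as described, i.e., stamping triggers on the raw data before distillation starts; the white corner patch, poison rate, and target class below are illustrative assumptions, and DOORPING, which re-optimizes the trigger throughout distillation, is omitted.

```python
import torch

def naive_attack_poison(x, y, target_class=0, poison_frac=0.01, patch=4):
    """Poison a small fraction of the *raw* training set before distillation.

    x: (n, channels, H, W) image tensor; y: (n,) label tensor.
    The trigger and rate are assumed values, not the paper's settings.
    """
    x, y = x.clone(), y.clone()
    n = max(1, int(poison_frac * len(x)))
    idx = torch.randperm(len(x))[:n]
    x[idx, :, -patch:, -patch:] = 1.0   # bottom-right white square trigger
    y[idx] = target_class               # trigger now maps to the target class
    return x, y

# The poisoned (x, y) then feed an unmodified distillation pipeline; the
# backdoor is carried into the synthetic set it produces.
```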
Language: English
Citations: 15
Sensors, Journal Year: 2025, Volume and Issue: 25(8), P. 2368 - 2368
Published: April 8, 2025
The accurate and efficient classification of network traffic, including malicious traffic, is essential for effective network management, cybersecurity, and resource optimization. However, traffic classification methods in modern, complex, and dynamic networks face significant challenges, particularly at the network edge, where resources are limited and issues such as privacy concerns and concept drift arise. Data condensation techniques offer a solution by reducing data size, simplifying complex models, and transferring knowledge from large data. This paper explores data condensation methods (such as coreset selection, data compression, knowledge distillation, and dataset distillation) within the context of network traffic classification tasks. It clarifies the relationship between these methods and traffic classification, introducing each method and its typical applications. It also outlines potential scenarios for applying each technique, highlighting the associated challenges and open research issues. To the best of our knowledge, this is the first comprehensive summary of data condensation techniques specifically tailored to network traffic classification.
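Of the condensation families surveyed, coreset selection is the simplest to sketch: pick a handful of flows that cover the feature space. Below is a standard greedy k-center selection over per-flow feature vectors, assuming a PyTorch setting; `feats` is an illustrative name, not from the paper.

```python
import torch

def k_center_greedy(feats, k):
    """Greedy k-center coreset selection over an (n, d) flow-feature tensor.

    Returns the indices of k representative flows; training an edge
    classifier on this subset stands in for training on all n flows.
    """
    chosen = [int(torch.randint(feats.size(0), (1,)))]
    # Distance from every flow to its nearest selected centre so far.
    dist = torch.cdist(feats, feats[chosen]).squeeze(1)
    for _ in range(k - 1):
        nxt = int(torch.argmax(dist))            # farthest-point heuristic
        chosen.append(nxt)
        dist = torch.minimum(dist, torch.cdist(feats, feats[[nxt]]).squeeze(1))
    return chosen
```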
Language: English
Citations: 0
Published: Aug. 1, 2023
Dataset distillation is attracting more attention in machine learning as training sets continue to grow and the cost of training state-of-the-art models becomes increasingly high. By synthesizing datasets with high information density, dataset distillation offers a range of potential applications, including support for continual learning, neural architecture search, and privacy protection. Despite recent advances, we lack a holistic understanding of the approaches and applications. Our survey aims to bridge this gap by first proposing a taxonomy of dataset distillation, characterizing existing approaches, and then systematically reviewing the data modalities and related applications. In addition, we summarize the challenges and discuss future directions for this field of research.
Language: English
Citations: 8
Neural Networks, Journal Year: 2024, Volume and Issue: 172, P. 106154 - 106154
Published: Jan. 29, 2024
Language: English
Citations: 2
IEEE Transactions on Information Forensics and Security, Journal Year: 2023, Volume and Issue: 18, P. 5848 - 5859
Published: Jan. 1, 2023
Federated learning (FL) models are vulnerable to membership inference attacks (MIAs), and the requirement of individual privacy motivates the protection of data subjects whose data is distributed across multiple users in the cross-silo FL setting. In this paper, we propose a subject-level membership inference attack based on data augmentation and model discrepancy. It can effectively infer whether the data distribution of a target subject has been sampled and used for training by a specific federated user, even if other users (also) may sample from the same distribution and use it as part of their training set. Specifically, the adversary uses a generative adversarial network (GAN) to perform data augmentation with a small amount of a priori federation-associated information known in advance. Subsequently, the adversary aggregates two different outputs of the global model and the tested user's model using an optimal feature construction method. We simulate a controlled federation configuration and conduct extensive experiments on real datasets that include both image and categorical data. The results show that the area under the curve (AUC) is improved by 12.6% and 16.8% compared with the classical membership inference attack. This comes at the expense of the test accuracy of the model augmented with the GAN, which is at most 3.5% lower than the baseline. We also explore the degree of privacy leakage between the overfitted and the well-generalized settings and conclude experimentally that the former is more likely to leak, with a degradation rate of up to 0.43. Finally, we present possible defense mechanisms to attenuate this newly discovered risk.
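As a loose illustration only: the sketch below scores a subject by the confidence gap between the tested user's model and the global model on GAN-augmented probes. It substitutes simple confidence thresholding for the paper's optimal feature construction, and `gan_augment` is an assumed hook around the adversary's GAN, not the paper's interface.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def subject_score(model, subject_x, gan_augment):
    # Average top-class confidence on augmented samples drawn from the
    # target subject's distribution (a simplified membership signal).
    x_aug = gan_augment(subject_x)            # enlarge the probe set
    probs = F.softmax(model(x_aug), dim=1)
    return probs.max(dim=1).values.mean()

def infer_subject_membership(global_model, user_model, subject_x,
                             gan_augment, threshold=0.1):
    # Model-discrepancy signal: a user who trained on the subject's data
    # tends to be more confident on it than the aggregated global model.
    gap = (subject_score(user_model, subject_x, gan_augment)
           - subject_score(global_model, subject_x, gan_augment))
    return bool(gap > threshold)              # threshold calibrated offline
```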
Language: English
Citations: 4
Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 334 - 351
Published: Nov. 20, 2024
Language: English
Citations: 1
arXiv (Cornell University), Journal Year: 2023, Volume and Issue: unknown
Published: Jan. 1, 2023
Deep learning technology has developed unprecedentedly in the last decade and has become the primary choice in many application domains. This progress is mainly attributed to a systematic collaboration in which rapidly growing computing resources encourage advanced algorithms to deal with massive data. However, it has gradually become challenging to handle the unlimited growth of data with limited computing power. To this end, diverse approaches have been proposed to improve data processing efficiency. Dataset distillation, a dataset reduction method, addresses this problem by synthesizing a small typical dataset from substantial data and has attracted much attention from the deep learning community. Existing dataset distillation methods can be taxonomized into meta-learning and data matching frameworks according to whether they explicitly mimic the performance of target data. Although dataset distillation has shown surprising performance in compressing datasets, there remain several limitations, such as distilling high-resolution data or data with complex label spaces. This paper provides a holistic understanding of dataset distillation from multiple aspects, including distillation frameworks and algorithms, factorized dataset distillation, performance comparison, and applications. Finally, we discuss challenges and promising directions to further promote future studies on dataset distillation.
Language: English
Citations: 1