Cited by The Thousand Faces of Explainable AI Along the Machine Learning Life Cycle: Industrial Reality and Current State of Research

Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses

Micah Goldblum, Dimitris Tsipras, Chulin Xie, et al.

IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal Year: 2022, Volume and Issue: 45(2), P. 1563 - 1580

Published: March 25, 2022

As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance. The absence of trustworthy human supervision over the data collection process exposes organizations to security vulnerabilities; training data can be manipulated to control and degrade the downstream behaviors of learned models. The goal of this work is to systematically categorize and discuss a wide range of dataset vulnerabilities and exploits, approaches for defending against these threats, and an array of open problems in this space.

Language: English

Citations

173

Foundation Models and Fair Use

Peter Henderson, Xuechen Li, Dan Jurafsky, et al.

SSRN Electronic Journal, Journal Year: 2023, Volume and Issue: unknown

Published: Jan. 1, 2023

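Existing foundation models are trained on copyrighted material. Deploying these models can pose both legal and ethical risks when data creators fail to receive appropriate attribution or compensation. In the United States and several other countries, copyrighted content may be used to build foundation models without incurring liability due to the fair use doctrine. However, there is a caveat: if the model produces output that is similar to copyrighted data, particularly in scenarios that affect the market of that data, fair use may no longer apply to the output of the model. In this work, we emphasize that fair use is not guaranteed, and additional work may be necessary to keep model development and deployment squarely in the realm of fair use. First, we survey the potential risks of developing and deploying foundation models based on copyrighted content. We review relevant U.S. case law, drawing parallels to existing and potential applications for generating text, source code, and visual art. Experiments confirm that popular foundation models can generate content considerably similar to copyrighted material. Second, we discuss technical mitigations that can help foundation models stay in line with fair use. We argue that more research is needed to align mitigation strategies with the current state of the law. Lastly, we suggest that the law and technical mitigations should co-evolve. For example, coupled with other policy mechanisms, the law could more explicitly consider safe harbors when strong technical tools are used to mitigate infringement harms. This co-evolution may help strike a balance between intellectual property and innovation, which speaks to the original goal of fair use. But we emphasize that the strategies we describe here are not a panacea and more work is needed to develop policies that address the potential harms of foundation models.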

Language: English

Leveraging explanations in interactive machine learning: An overview

Stefano Teso, Öznur Alkan, Wolfgang Stammer, et al.

Frontiers in Artificial Intelligence, Journal Year: 2023, Volume and Issue: 6

Published: Feb. 23, 2023

Explanations have gained an increasing level of interest in the AI and Machine Learning (ML) communities in order to improve model transparency and allow users to form a mental model of a trained ML model. However, explanations can go beyond this one-way communication and serve as a mechanism to elicit user control, because once users understand, they can then provide feedback. The goal of this paper is to present an overview of research where explanations are combined with interactive capabilities as a means to learn new models from scratch and to edit and debug existing ones. To this end, we draw a conceptual map of the state-of-the-art, grouping relevant approaches based on their intended purpose and on how they structure the interaction, highlighting similarities and differences between them. We also discuss open research issues and outline possible directions forward, with the hope of spurring further research on this blooming research topic.

Language: English

Machine Unlearning of Features and Labels

Alexander Warnecke, Lukas Pirch, Christian Wressnegger, et al.

Published: Jan. 1, 2023

Removing information from a machine learning model is a non-trivial task that requires partially reverting the training process. This is unavoidable when sensitive data, such as credit card numbers or passwords, accidentally enter the model and need to be removed afterwards. Recently, different concepts for machine unlearning have been proposed to address this problem. While these approaches are effective in removing individual data points, they do not scale to scenarios where larger groups of features and labels need to be reverted. In this paper, we propose the first method for unlearning features and labels. Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters. It makes it possible to adapt the influence of training data on a learning model retrospectively, thereby correcting data leaks and privacy issues. For learning models with strongly convex loss functions, our method provides certified unlearning with theoretical guarantees. For models with non-convex losses, we empirically show that unlearning features and labels is effective and significantly faster than other strategies.

Language: English
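The abstract above describes unlearning as a closed-form parameter update built on influence functions, without giving the update itself. As a rough illustration of the general idea only, the sketch below applies a Newton-style (second-order) correction to a one-parameter ridge regression when one training point is replaced by a corrected version. The model, the function names (`fit`, `grad`, `unlearn`) and the data are assumptions made for this example; they are not taken from the paper.

```python
# Illustrative sketch only: a one-parameter ridge model, not the authors' code.

def fit(points, lam=0.1):
    """Closed-form minimizer of sum(0.5*(w*x - y)^2) + 0.5*lam*w^2."""
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return sxy / (sxx + lam)


def grad(w, point):
    """Gradient of the per-example loss 0.5*(w*x - y)^2 with respect to w."""
    x, y = point
    return (w * x - y) * x


def unlearn(w, points, old, new, lam=0.1):
    """Replace `old` by `new` without retraining.

    Applies w' = w - H^{-1} * (grad(new) - grad(old)), where H is the
    Hessian of the training objective on the ORIGINAL data (a scalar here).
    """
    hessian = sum(x * x for x, _ in points) + lam
    return w - (grad(w, new) - grad(w, old)) / hessian


# Hypothetical data: the last label is wrong and is corrected afterwards.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 8.0)]
w = fit(points)
w_updated = unlearn(w, points, old=(4.0, 8.0), new=(4.0, 4.1))
w_retrained = fit(points[:3] + [(4.0, 4.1)])
print(w_updated, w_retrained)
```

In this toy case only the label changes, so the Hessian is unchanged and the update coincides with retraining from scratch. When a feature value changes, the Hessian of the corrected data differs and the same update is only an approximation.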

FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging

Han Guo, Nazneen Fatema Rajani, Peter Hase, et al.

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Journal Year: 2021, Volume and Issue: unknown

Published: Jan. 1, 2021

Influence functions approximate the “influences” of training data-points for test predictions and have a wide variety of applications. Despite the popularity, their computational cost does not scale well with model and training data size. We present FastIF, a set of simple modifications to influence functions that significantly improves their run-time. We use k-Nearest Neighbors (kNN) to narrow the search space down to a subset of good candidate data points, identify the configurations that best balance the speed-quality trade-off in estimating the inverse Hessian-vector product, and introduce a fast parallel variant. Our proposed method achieves about 80X speedup while being highly correlated with the original influence values. With the availability of the fast influence functions, we demonstrate their usefulness in four applications. First, we examine whether influential data-points can “explain” test time behavior using the framework of simulatability. Second, we visualize the influence interactions between training and test data-points. Third, we show that we can correct model errors by additional fine-tuning on certain influential data-points, improving the accuracy of a trained MultiNLI model by 2.5% on the HANS dataset. Finally, we experiment with a similar setup but fine-tuning on datapoints not seen during training, improving the model accuracy by 2.8% and 1.7% on HANS and ANLI datasets respectively. Overall, our fast influence functions can be efficiently applied to large models and datasets, and our experiments demonstrate the potential of influence functions in model interpretation and correcting model errors.

Language: English
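For readers unfamiliar with the quantity that FastIF accelerates, the following is a minimal sketch of a plain influence-function score, assuming a one-parameter least-squares model so that the inverse Hessian-vector product reduces to a scalar division. It does not implement FastIF's kNN candidate selection or its parallel estimator; the model, the function names (`fit`, `grad`, `influence`) and the data are hypothetical.

```python
# Illustrative sketch only: classic influence scores on a toy model,
# not the FastIF implementation.

def fit(xs, ys, lam=0.1):
    """Closed-form minimizer of mean(0.5*(w*x - y)^2) + 0.5*lam*w^2."""
    n = len(xs)
    sxx = sum(x * x for x in xs) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) / n
    return sxy / (sxx + lam)


def grad(w, x, y):
    """Gradient of the per-example loss 0.5*(w*x - y)^2 with respect to w."""
    return (w * x - y) * x


def influence(xs, ys, x_test, y_test, lam=0.1):
    """Influence of up-weighting each training point on the test loss.

    I_i = -grad_test * H^{-1} * grad_i, with H the Hessian of the
    training objective. A positive score marks a point that hurts the
    test prediction; removing point i changes the test loss by roughly
    -I_i / n.
    """
    n = len(xs)
    w = fit(xs, ys, lam)
    hessian = sum(x * x for x in xs) / n + lam
    g_test = grad(w, x_test, y_test)
    return [-g_test * grad(w, x, y) / hessian for x, y in zip(xs, ys)]


# Hypothetical data: the last training label is an outlier.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 8.0]
scores = influence(xs, ys, x_test=2.5, y_test=2.5)
most_harmful = max(range(len(xs)), key=lambda i: scores[i])
print(most_harmful, scores)
```

On realistic models the Hessian is far too large to invert directly, which is why the abstract singles out the inverse Hessian-vector product estimate and the size of the candidate set as the places where run-time can be recovered.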

Combining Feature and Instance Attribution to Detect Artifacts

Pouya Pezeshkpour, Sarthak Jain, Sameer Singh, et al.

Findings of the Association for Computational Linguistics: ACL 2022, Journal Year: 2022, Volume and Issue: unknown, P. 1934 - 1946

Published: Jan. 1, 2022

Training the deep neural networks that dominate NLP requires large datasets. These are often collected automatically or via crowdsourcing, and may exhibit systematic biases or annotation artifacts. By the latter we mean spurious correlations between inputs and outputs that do not represent a generally held causal relationship between features and classes; models that exploit such correlations may appear to perform a given task well, but fail on out-of-sample data. In this paper, we evaluate the use of different attribution methods for aiding identification of training data artifacts. We propose new hybrid approaches that combine saliency maps (which highlight important input features) with instance attribution methods (which retrieve training samples influential to a given prediction). We show that this proposed training-feature attribution can be used to efficiently uncover artifacts in training data when a challenging validation set is available. We also carry out a small user study to evaluate whether these methods are useful to NLP researchers in practice, with promising results. We make code for all methods and experiments in this paper available.

Language: English

A Comprehensive Study of Image Classification Model Sensitivity to Foregrounds, Backgrounds, and Visual Attributes

Mazda Moayeri, Phillip E. Pope, Yogesh Balaji, et al.

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal Year: 2022, Volume and Issue: unknown, P. 19065 - 19075

Published: June 1, 2022

While datasets with single-label supervision have propelled rapid advances in image classification, additional annotations are necessary in order to quantitatively assess how models make predictions. To this end, for a subset of ImageNet samples, we collect segmentation masks for the entire object and 18 informative attributes. We call this dataset RIVAL10 (RIch Visual Attributes with Localization), consisting of roughly 26k instances over 10 classes. Using RIVAL10, we evaluate the sensitivity of a broad set of models to noise corruptions in foregrounds, backgrounds and attributes. In our analysis, we consider diverse state-of-the-art architectures (ResNets, Transformers) and training procedures (CLIP, SimCLR, DeiT, Adversarial Training). We find that, somewhat surprisingly, in ResNets, adversarial training makes models more sensitive to the background compared to the foreground than standard training. Similarly, contrastively-trained models also have lower relative foreground sensitivity in both transformers and ResNets. Lastly, we observe intriguing adaptive abilities of transformers to increase relative foreground sensitivity as corruption level increases. Using saliency methods, we automatically discover spurious features that drive the background sensitivity of models and assess alignment of saliency maps with foregrounds. Finally, we quantitatively study the attribution problem for neural features by comparing feature saliency with ground-truth localization of semantic attributes.

Language: English

GIF: A General Graph Unlearning Strategy via Influence Function

Jiancan Wu, Yi Yang, Yuchun Qian, et al.

Proceedings of the ACM Web Conference 2023, Journal Year: 2023, Volume and Issue: unknown, P. 651 - 661

Published: April 26, 2023

With the greater emphasis on privacy and security in our society, the problem of graph unlearning (revoking the influence of specific data on a trained GNN model) is drawing increasing attention. However, ranging from machine unlearning to recently emerged graph unlearning methods, existing efforts either resort to the retraining paradigm, or perform approximate erasure that fails to consider the inter-dependency between connected neighbors or imposes constraints on GNN structure, and therefore struggle to achieve satisfying performance-complexity trade-offs. In this work, we explore the influence function tailored for graph unlearning, so as to improve the unlearning efficacy and efficiency for graph unlearning. We first present a unified problem formulation of diverse graph unlearning tasks with respect to node, edge, and feature. Then, we recognize the crux of the inability of the traditional influence function for graph unlearning, and devise Graph Influence Function (GIF), a model-agnostic unlearning method that can efficiently and accurately estimate parameter changes in response to an ε-mass perturbation in deleted data. The idea is to supplement the objective of the traditional influence function with an additional loss term of the influenced neighbors due to the structural dependency. Further deductions on the closed-form solution of parameter changes provide a better understanding of the unlearning mechanism. We conduct extensive experiments on four representative GNN models and three benchmark datasets to justify the superiority of GIF for diverse graph unlearning tasks in terms of unlearning efficacy, model utility, and unlearning efficiency. Our implementations are available at https://github.com/wujcan/GIF-torch/.

Language: English

Influential Exemplar Replay for Incremental Learning in Recommender Systems

Xinni Zhang, Yankai Chen, Chenhao Ma, et al.

Proceedings of the AAAI Conference on Artificial Intelligence, Journal Year: 2024, Volume and Issue: 38(8), P. 9368 - 9376

Published: March 24, 2024

Personalized recommender systems have found widespread applications for effective information filtering. Conventional models engage in knowledge mining within the static setting to reconstruct singular historical data. Nonetheless, the dynamics of real-world environments are in a constant state of flux, rendering acquired model knowledge inadequate for accommodating emergent trends and thus leading to a notable decline in recommendation performance. Given the typically prohibitive cost of exhaustive model retraining, it has emerged to study incremental learning for recommender systems with ever-growing data. In this paper, we propose an effective model-agnostic framework, namely INFluential Exemplar Replay (INFER). INFER facilitates recommender models in retaining the earlier assimilated knowledge, e.g., users' enduring preferences, while concurrently accommodating evolving trends manifested in users' new interaction behaviors. We commence with a vanilla implementation that centers on identifying the most representative data samples for effective consolidation of early knowledge. Subsequently, we propose an advanced solution, namely INFERONCE, to optimize the computational overhead associated with the vanilla implementation. Extensive experiments on four prototypical backbone models, two classic recommendation tasks, and widely used benchmarks consistently demonstrate the effectiveness of our method as well as its compatibility for extending to several incremental recommender models.

Language: English

Towards Tracing Knowledge in Language Models Back to the Training Data

Ekin Akyürek, Tolga Bolukbasi, Frederick Liu, et al.

Published: Jan. 1, 2022

Language models (LMs) have been shown to memorize a great deal of factual knowledge contained in their training data. But when an LM generates an assertion, it is often difficult to determine where it learned this information and whether it is true. In this paper, we propose the problem of fact tracing: identifying which training examples taught an LM to generate a particular factual assertion. Prior work on training data attribution (TDA) may offer effective tools for identifying such examples, known as “proponents”. We present the first quantitative benchmark to evaluate this. We compare two popular families of TDA methods (gradient-based and embedding-based) and find that much headroom remains. For example, both methods have lower proponent-retrieval precision than an information retrieval baseline (BM25) that does not have access to the LM at all. We identify key challenges that may be necessary for further improvement, such as overcoming the problem of gradient saturation, and also show how several nuanced implementation details of existing neural TDA methods can significantly improve overall fact tracing performance.

Language: English
