MoPe: Model Perturbation based Privacy Attacks on Language Models (Creative Commons)

Marvin Li, Jason Wang, Jeffrey Wang

et al.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Journal Year: 2023, Volume and Issue: unknown, P. 13647 - 13660

Published: Jan. 1, 2023

Recent work has shown that Large Language Models (LLMs) can unintentionally leak sensitive information present in their training data. In this paper, we present Model Perturbations (MoPe), a new method to identify with high confidence if a given text is in the training data of a pre-trained language model, given white-box access to the model's parameters. MoPe adds noise to the model in parameter space and measures the drop in log-likelihood at a given point x, a statistic we show approximates the trace of the Hessian matrix with respect to model parameters. Across language models ranging from 70M to 12B parameters, we show that MoPe is more effective than existing loss-based attacks and recently proposed perturbation-based methods. We also examine the role of training point order and model size in attack success, and empirically demonstrate that MoPe accurately approximates the trace of the Hessian in practice. Our results show that the loss of a point alone is insufficient to determine extractability: there are training points we can recover using our method that have average loss. This casts some doubt on prior works that use the loss of a point as evidence of memorization or unlearning.
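The perturbation statistic described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes isotropic Gaussian noise with standard deviation `sigma`, treats the loss as a negative log-likelihood, and swaps the language model for a toy quadratic loss whose Hessian is known. The names `mope_statistic` and `quadratic_loss` are hypothetical. For a quadratic loss, the expected increase in loss under such noise equals (sigma^2 / 2) times the trace of the Hessian, which is the relationship the abstract alludes to.

```python
import numpy as np


def mope_statistic(loss_fn, theta, sigma, n_perturbations, rng):
    """Average increase in loss when Gaussian noise is added to the parameters.

    Assumption: noise is drawn i.i.d. from N(0, sigma^2) per parameter.
    """
    base_loss = loss_fn(theta)
    noise = rng.normal(0.0, sigma, size=(n_perturbations, theta.shape[0]))
    perturbed_losses = np.array([loss_fn(theta + eps) for eps in noise])
    return perturbed_losses.mean() - base_loss


# Toy stand-in for a model: a quadratic loss with a known Hessian A.
rng = np.random.default_rng(0)
dim = 20
M = rng.normal(size=(dim, dim))
A = M @ M.T / dim  # symmetric positive semi-definite Hessian
theta = rng.normal(size=dim)


def quadratic_loss(params):
    return 0.5 * params @ A @ params


sigma = 0.5
stat = mope_statistic(quadratic_loss, theta, sigma, 20000, rng)

# For a quadratic loss, E[L(theta + eps)] - L(theta) = (sigma^2 / 2) * trace(A).
expected = 0.5 * sigma**2 * np.trace(A)
print(f"perturbation statistic: {stat:.4f}")
print(f"(sigma^2 / 2) * trace(Hessian): {expected:.4f}")
```

With enough perturbations the two printed values should agree closely. For a real language model the loss is not quadratic, so the match would only hold approximately and for suitably small noise.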

Language: English

Foundation Models and Fair Use
Peter Henderson, Xuechen Li, Dan Jurafsky

et al.

SSRN Electronic Journal, Journal Year: 2023, Volume and Issue: unknown

Published: Jan. 1, 2023

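Existing foundation models are trained on copyrighted material. Deploying these models can pose both legal and ethical risks when data creators fail to receive appropriate attribution or compensation. In the United States and several other countries, copyrighted content may be used to build foundation models without incurring liability due to the fair use doctrine. However, there is a caveat: if the model produces output that is similar to copyrighted data, particularly in scenarios that affect the market of that data, fair use may no longer apply to the output of the model. In this work, we emphasize that fair use is not guaranteed, and additional work may be necessary to keep model development and deployment squarely in the realm of fair use. First, we survey the potential risks of developing and deploying foundation models based on copyrighted content. We review relevant U.S. case law, drawing parallels to existing and potential applications for generating text, source code, and visual art. Experiments confirm that popular foundation models can generate content considerably similar to copyrighted material. Second, we discuss technical mitigations that can help foundation models stay in line with fair use. We argue that more research is needed to align mitigation strategies with the current state of the law. Lastly, we suggest that the law and technical mitigations should co-evolve. For example, coupled with other policy mechanisms, the law could more explicitly consider safe harbors when strong technical tools are used to mitigate infringement harms. This co-evolution may help strike a balance between intellectual property and innovation, which speaks to the original goal of fair use. But we emphasize that the strategies we describe here are not a panacea and more work is needed to develop policies that address the potential harms of foundation models.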

Language: English

Citations

47

State of the Art on Diffusion Models for Visual Computing
Riccardo Pò, Yifan Wang, Vladislav Golyanik

et al.

Computer Graphics Forum, Journal Year: 2024, Volume and Issue: 43(2)

Published: April 30, 2024

The field of visual computing is rapidly advancing due to the emergence of generative artificial intelligence (AI), which unlocks unprecedented capabilities for the generation, editing, and reconstruction of images, videos, and 3D scenes. In these domains, diffusion models are the generative AI architecture of choice. Within the last year alone, the literature on diffusion-based tools and applications has seen exponential growth, and relevant papers are published across the computer graphics, computer vision, and AI communities with new works appearing daily on arXiv. This rapid growth of the field makes it difficult to keep up with all recent developments. The goal of this state-of-the-art report (STAR) is to introduce the basic mathematical concepts of diffusion models, implementation details and design choices of the popular Stable Diffusion model, as well as overview important aspects of these generative AI tools, including personalization, conditioning, inversion, among others. Moreover, we give a comprehensive overview of the rapidly growing literature on diffusion-based generation and editing, categorized by the type of generated medium, including 2D images, videos, 3D objects, locomotion, and 4D scenes. Finally, we discuss available datasets, metrics, open challenges, and social implications. This STAR provides an intuitive starting point to explore this exciting topic for researchers, artists, and practitioners alike.

Language: English

Citations

23

Creative encounters of a posthuman kind – anthropocentric law, artificial intelligence, and art
Julija Kalpokienė, Ignas Kalpokas

Technology in Society, Journal Year: 2023, Volume and Issue: 72, P. 102197 - 102197

Published: Jan. 16, 2023

Language: English

Citations

20

Creativity and Machine Learning: A Survey (Creative Commons)
Giorgio Franceschelli, Mirco Musolesi

arXiv (Cornell University), Journal Year: 2021, Volume and Issue: unknown

Published: Jan. 1, 2021

There is a growing interest in the area of machine learning and creativity. This survey presents an overview of the history and the state of the art of computational creativity theories, key machine learning techniques (including generative deep learning), and corresponding automatic evaluation methods. After presenting a critical discussion of the key contributions in this area, we outline the current research challenges and emerging opportunities in this field.

Language: English

Citations

15

Generative AI and Content-Creator Economy: Evidence from Online Content Creation Platforms

Hongxian Huang, Runshan Fu, Anindya Ghose

et al.

SSRN Electronic Journal, Journal Year: 2023, Volume and Issue: unknown

Published: Jan. 1, 2023

Generative Artificial Intelligence (AI) technologies have emerged as a disruptive force in the content-creator economy. In this paper, we aim to understand how the integration of generative AI with online content creation platforms affects its main players: content creators. Specifically, we leverage quasi-experiments on two leading content creation platforms in Asia, Lofter and Graffiti Kingdom, to quantify the effect of adopting or prohibiting generative AI tools on their creators. Our findings reveal that introducing the generative AI tool led to a significant decrease in creators' activities on the platform. Conversely, the prohibition of generative AI tools produced more nuanced effects. Although the overall effect is not statistically significant, creators who continued using the platform after the prohibition (i.e., non-churners) showed an increase in activities. Moreover, we demonstrate that creators with heightened copyright concerns are more inclined to [...] the adoption on Lofter, while popular creators are less likely to [...]. This paper provides important insights for policymakers and platforms.

Language: English

Citations

6

MoPe: Model Perturbation based Privacy Attacks on Language Models (Creative Commons)

Marvin Li, Jason Wang, Jeffrey Wang

et al.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Journal Year: 2023, Volume and Issue: unknown, P. 13647 - 13660

Published: Jan. 1, 2023

Recent work has shown that Large Language Models (LLMs) can unintentionally leak sensitive information present in their training data. In this paper, we present Model Perturbations (MoPe), a new method to identify with high confidence if a given text is in the training data of a pre-trained language model, given white-box access to the model's parameters. MoPe adds noise to the model in parameter space and measures the drop in log-likelihood at a given point x, a statistic we show approximates the trace of the Hessian matrix with respect to model parameters. Across language models ranging from 70M to 12B parameters, we show that MoPe is more effective than existing loss-based attacks and recently proposed perturbation-based methods. We also examine the role of training point order and model size in attack success, and empirically demonstrate that MoPe accurately approximates the trace of the Hessian in practice. Our results show that the loss of a point alone is insufficient to determine extractability: there are training points we can recover using our method that have average loss. This casts some doubt on prior works that use the loss of a point as evidence of memorization or unlearning.

Language: English

Citations

3