On using distribution-based compositionality assessment to evaluate compositional generalisation in machine translation
Anssi Moisio, Mathias Creutz, Mikko Kurimo, et al.

Published: Jan. 1, 2023

Compositional generalisation (CG), in NLP and in machine learning more generally, has been assessed mostly using artificial datasets. It is important to develop benchmarks to assess CG also in real-world natural language tasks in order to understand the abilities and limitations of systems deployed in the wild. To this end, our GenBench Collaborative Benchmarking Task submission utilises the distribution-based compositionality assessment (DBCA) framework to split the Europarl translation corpus into a training and a test set in such a way that the test set requires compositional generalisation capacity. Specifically, the training and test sets have divergent distributions of dependency relations, testing NMT systems’ capability of translating dependencies that they have not been trained on. This is a fully-automated procedure to create benchmarks, making it simple and inexpensive to apply it further to other datasets and languages. The code and data for the experiments are available at https://github.com/aalto-speech/dbca.

Language: English

Dissociating language and thought in large language models
Kyle Mahowald, Anna A. Ivanova, Idan Blank, et al.

Trends in Cognitive Sciences, Journal Year: 2024, Volume and Issue: 28(6), P. 517 - 540

Published: March 19, 2024

Language: English

Citations: 122

On the creativity of large language models
Giorgio Franceschelli, Mirco Musolesi

AI & Society, Journal Year: 2024, Volume and Issue: unknown

Published: Nov. 28, 2024

Large language models (LLMs) are revolutionizing several areas of Artificial Intelligence. One of the most remarkable applications is creative writing, e.g., poetry or storytelling: the generated outputs are often of astonishing quality. However, a natural question arises: can LLMs be really considered creative? In this article, we first analyze the development of LLMs under the lens of creativity theories, investigating the key open questions and challenges. In particular, we focus our discussion on the dimensions of value, novelty, and surprise as proposed by Margaret Boden in her work. Then, we consider different classic perspectives, namely product, process, press, and person. We discuss a set of “easy” and “hard” problems in machine creativity, presenting them in relation to LLMs. Finally, we examine the societal impact of these technologies with a particular focus on the creative industries, analyzing the opportunities offered, the challenges arising from them, and the potential associated risks, from both legal and ethical points of view.

Language: English

Citations: 18

Analyzing Leakage of Personally Identifiable Information in Language Models
Nils Lukas, Ahmed Salem, Robert B. Sim, et al.

2022 IEEE Symposium on Security and Privacy (SP), Journal Year: 2023, Volume and Issue: unknown, P. 346 - 363

Published: May 1, 2023

Language: English

Citations: 42

Theory Is All You Need: AI, Human Cognition, and Decision Making
Teppo Felin, Matthias Holweg

SSRN Electronic Journal, Journal Year: 2024, Volume and Issue: unknown

Published: Jan. 1, 2024

Artificial intelligence (AI) now matches or outperforms humans in an astonishing array of games, tests, and other cognitive tasks that involve high-level reasoning and thinking. Many scholars argue that, due to human bias and bounded rationality, humans should (or will soon) be replaced by AI in situations involving cognition and strategic decision making. We disagree. In this paper we first trace the historical origins of the idea of artificial intelligence as a form of computation and information processing. We highlight problems with the analogy between computers and human minds as input-output devices, using large language models as an example. Human cognition, in important instances, is better conceptualized as a form of theorizing rather than data processing, prediction, or even Bayesian updating. Our argument, when it comes to cognition, is that AI's data-based prediction is different from human theory-based causal logic. We introduce the idea of belief-data (a)symmetries to highlight the difference between AI and human cognition, and use "heavier-than-air flight" as an example of our arguments. Theories provide a mechanism for identifying new evidence, a way of "intervening" in the world, experimenting, and problem solving. We conclude with a discussion of the implications of our arguments for strategic decision making, including the role that human-AI hybrids might play in this process.

Language: English

Citations: 10

See Widely, Think Wisely: Toward Designing a Generative Multi-agent System to Burst Filter Bubbles
Yu Zhang, Jingwei Sun, Feng Li, et al.

Published: May 11, 2024

The proliferation of AI-powered search and recommendation systems has accelerated the formation of "filter bubbles" that reinforce people's biases and narrow their perspectives. Previous research has attempted to address this issue by increasing the diversity of information exposure, which is often hindered by a lack of user motivation to engage with it. In this study, we took a human-centered approach to explore how Large Language Models (LLMs) could assist users in embracing more diverse perspectives. We developed a prototype featuring LLM-powered multi-agent characters that users could interact with while reading social media content. We conducted a participatory design study with 18 participants and found that multi-agent dialogues with gamification incentives could motivate users to engage with opposing viewpoints. Additionally, progressive interactions with assessment tasks could promote thoughtful consideration. Based on these findings, we provided design implications and future work outlooks for leveraging LLMs to help users burst their filter bubbles.

Language: English

Citations: 8

Theory Is All You Need: AI, Human Cognition, and Causal Reasoning
Teppo Felin, Matthias Holweg

Strategy Science, Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 3, 2024

Scholars argue that artificial intelligence (AI) can generate genuine novelty and new knowledge and, in turn, that AI and computational models of cognition will replace human decision making under uncertainty. We disagree. We argue that AI’s data-based prediction is different from human theory-based causal logic and reasoning. We highlight problems with the decades-old analogy between computers and minds as input–output devices, using large language models as an example. Human cognition is better conceptualized as a form of theory-based causal reasoning rather than AI’s emphasis on information processing and data-based prediction. AI uses a probability-based approach to knowledge and is largely backward looking and imitative, whereas human cognition is forward-looking and capable of generating genuine novelty. We introduce the idea of data–belief asymmetries to highlight the difference between AI and human cognition, using the example of heavier-than-air flight to illustrate our arguments. Theory-based causal reasoning provides a cognitive mechanism for humans to intervene in the world and to engage in directed experimentation to generate new data. Throughout the article, we discuss the implications of our argument for understanding the origins of novelty, new knowledge, and decision making under uncertainty.

Language: English

Citations: 7

Staying ahead with generative artificial intelligence for learning: navigating challenges and opportunities with 5Ts and 3Rs
Alwyn Vwen Yen Lee

Asia Pacific Journal of Education, Journal Year: 2024, Volume and Issue: 44(1), P. 81 - 93

Published: Jan. 2, 2024

The emergence of generative Artificial Intelligence (AI) is viewed as a disruptive technological advancement that has been beneficial for most educational purposes but is also coupled with emerging challenges and potentially destabilizing effects. Given the unprecedented onset and surge in interests, education stakeholders are often pressured to adopt such emergent technologies with little space and time to seek a better understanding and attain AI literacy. This paper brings together existing contributions to identify a list of five common themes (5Ts) for various uses of generative AI in improving students' learning and for future research. The challenges and opportunities from the use of generative AI were also explored, and as part of the rethink of how education can continue to be relevant in a dynamic environment with emergent technologies, three "R" guidelines (3Rs) are proposed to aid educators to stay ahead of the curve in addressing challenges and embracing opportunities arising from the use of generative AI for learning.

Language: English

Citations: 5

The structure and statistics of language jointly shape cross-frequency neural dynamics during spoken language comprehension
Hugo Weissbart, Andrea E. Martin

Nature Communications, Journal Year: 2024, Volume and Issue: 15(1)

Published: Oct. 14, 2024

Humans excel at extracting structurally-determined meaning from speech despite inherent physical variability. This study explores the brain's ability to predict and understand spoken language robustly. It investigates the relationship between structural and statistical language knowledge in brain dynamics, focusing on phase and amplitude modulation. Using syntactic features from constituent hierarchies and surface statistics from a transformer model as predictors of forward encoding models, we reconstructed cross-frequency neural dynamics from MEG data during audiobook listening. Our findings challenge a strict separation of linguistic structure and statistics in the brain, with both aiding neural signal reconstruction. Syntactic features have a more temporally spread impact, and both word entropy and the number of closing constituents are linked to the phase-amplitude coupling of neural dynamics, implying a role in temporal prediction and cortical oscillation alignment during speech processing. Our results indicate that structured and statistical information jointly shape neural dynamics during spoken language comprehension and suggest an integration process via a cross-frequency coupling mechanism.

Language: English

Citations: 5

The Ethical Implications of Generative Audio Models: A Systematic Literature Review
Julia Barnett

Published: Aug. 8, 2023

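Generative audio models typically focus their applications in music and speech generation, with recent models having human-like quality in their audio output. This paper conducts a systematic literature review of 884 papers in the area of generative audio models in order to both quantify the degree to which researchers in the field are considering potential negative impacts and identify the types of ethical implications researchers in this area need to consider. Though 65% of generative audio research papers note positive potential impacts of their work, less than 10% discuss any negative impacts. This jarringly small percentage of papers considering negative impact is particularly worrying because the issues brought to light by the few papers doing so are raising serious ethical implications and concerns relevant to the broader field, such as the potential for fraud, deep-fakes, and copyright infringement. By quantifying this lack of ethical consideration in generative audio research and identifying key areas of potential harm, this paper lays the groundwork for future work in the field at a critical point in time in order to guide more conscientious research as this field progresses.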

Language: English

Citations: 12

An Accidental Benchmark: The History, Contingent Power, and Lasting Traces of the GTZAN Dataset
Allison Jerzak

Deleted Journal, Journal Year: 2025, Volume and Issue: 4(2)

Published: May 2, 2025

In 2002, George Tzanetakis presented a paper on how researchers could automatically classify musical genre from audio signals. Claiming that his model worked as well as human classifiers, Tzanetakis made his dataset available to anyone who asked for it. Ten years later, a systematic review found that this dataset had circulated on a massive scale: nearly 25% of papers on Music Genre Recognition (MGR) used the so-called GTZAN dataset in their research. Yet an analysis of the dataset revealed significant problems: repetitions, overrepresentations, and files distorted to the point of corruption, with few researchers indicating that they had ever listened to the files within it. These warnings went unheeded: GTZAN remains the most widely used MGR dataset today. In this paper, I examine GTZAN from a historical and musicological perspective. I trace the dataset’s introduction into the Music Information Retrieval (MIR) community, and show how MIR researchers’ tendency to view music as a digital object, a set of extracted statistical features, and a static, query-able combination of those created ground truths about music that remain embedded in our present-day infrastructures. I argue that tracing the history of GTZAN’s ascendence to benchmark status can recover the ground-truthing process used by early MIR researchers. In addition, this history provides context for the music industry’s shift towards descriptive tagging and context-based recommendations.

Language: English

Citations: 0