NLMs: Augmenting Negation in Language Models
Rituraj Singh, Rahul Kumar, Vivek Kumar Rangarajan Sridhar

et al.

Published: Jan. 1, 2023

Negation is a fundamental component of natural language that reverses the semantic meaning of a sentence. It plays an extremely important role across a wide range of applications, yet it is underrepresented in pre-trained language models (LMs), often resulting in wrong inferences. In this work, we try to improve the underlying understanding of negation in LMs. To augment this understanding, we propose a model objective with a weighted cross-entropy loss and elastic weight consolidation regularization. We reduce the mean top-1 error rate on the negated LAMA dataset to 1.1% for BERT-base, 0.78% for BERT-large, 3.74% for RoBERTa-base, and 0.01% for RoBERTa-large, improving on BERT by a margin of 8% and also outperforming existing models. We provide empirical evidence that the augmented models remain on par with the classical originals on standard benchmarks as well as on inference tasks.
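The objective described above pairs a weighted cross-entropy term with an elastic weight consolidation (EWC) penalty. Below is a minimal PyTorch sketch of that combination; the per-token weights, the Fisher estimates, and the coefficient lam are illustrative assumptions, not the paper's released implementation.

import torch
import torch.nn.functional as F

def ewc_penalty(model, fisher, old_params):
    # Elastic weight consolidation: penalize drift from the pre-trained
    # parameters, scaled by (an estimate of) Fisher information.
    loss = 0.0
    for name, param in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (param - old_params[name]) ** 2).sum()
    return loss

def negation_objective(logits, labels, token_weights, model, fisher, old_params, lam=0.1):
    # Weighted cross-entropy: up-weight tokens that carry negation
    # (hypothetical weighting scheme).
    ce = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1), reduction="none")
    ce = (ce * token_weights.view(-1)).mean()
    # The EWC term keeps the augmented model close to the original LM.
    return ce + lam * ewc_penalty(model, fisher, old_params)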

Language: English

How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty in Text Generation Using RAVEN
R. Thomas McCoy, Paul Smolensky, Tal Linzen

et al.

Transactions of the Association for Computational Linguistics, Journal Year: 2023, Volume and Issue: 11, P. 652 - 670

Published: Jan. 1, 2023

Abstract Current language models can generate high-quality text. Are they simply copying text they have seen before, or have they learned generalizable linguistic abstractions? To tease apart these possibilities, we introduce RAVEN, a suite of analyses for assessing the novelty of generated text, focusing on sequential structure (n-grams) and syntactic structure. We apply these analyses to four neural language models trained on English (an LSTM, a Transformer, Transformer-XL, and GPT-2). For local structure (e.g., individual dependencies), text generated with a standard sampling scheme is substantially less novel than our baseline of human-generated text from each model's test set. For larger-scale structure (e.g., overall sentence structure), model-generated text is as novel as, or even more novel than, the human-generated baseline, but models still sometimes copy substantially, in some cases duplicating passages over 1,000 words long from the training set. We also perform an extensive manual analysis, finding evidence that GPT-2 uses both compositional and analogical generalization mechanisms and showing that GPT-2's novel text is usually well-formed morphologically and syntactically but has reasonably frequent semantic issues (e.g., being self-contradictory).
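As a concrete illustration of the sequential (n-gram) half of such a novelty analysis, here is a small sketch that measures the fraction of generated n-grams never seen in a training corpus. It is not the RAVEN code itself (which also covers syntactic structure); the token lists and the choice of n are placeholders.

def ngrams(tokens, n):
    # All contiguous n-grams of a token sequence.
    return zip(*(tokens[i:] for i in range(n)))

def ngram_novelty(generated, training, n=5):
    # Fraction of generated n-grams absent from the training corpus
    # (1.0 = fully novel at this n, 0.0 = fully copied).
    seen = set(ngrams(training, n))
    gen = list(ngrams(generated, n))
    return sum(g not in seen for g in gen) / max(len(gen), 1)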

Language: English

Citations

25

Machine Learning Assisted Hit Prioritization for High Throughput Screening in Drug Discovery
Davide Boldini, Lukas Friedrich, Daniel Kühn

et al.

ACS Central Science, Journal Year: 2024, Volume and Issue: unknown

Published: March 15, 2024

Efficient prioritization of bioactive compounds from high throughput screening campaigns is a fundamental challenge for accelerating drug development efforts. In this study, we present the first data-driven approach to simultaneously detect assay interferents and prioritize true bioactive compounds. By analyzing the learning dynamics during the training of a gradient boosting model on noisy screening data, using a novel formulation of sample influence, we are able to distinguish between compounds exhibiting the desired biological response and those producing assay artifacts. Our method therefore enables false positive detection without relying on prior screens or knowledge of interference mechanisms, making it applicable to any screening campaign. We demonstrate that it consistently excludes interferents with different mechanisms and prioritizes biologically relevant compounds more efficiently than all tested baselines, including in a retrospective case study simulating its use in a real drug discovery campaign. Finally, the tool is extremely computationally efficient, requiring less than 30 s per screen on low-resource hardware. As such, our findings show it is an ideal addition to existing tools and can be used to guide further pharmacological optimization after screening campaigns.
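The core idea (flagging labeled actives that a gradient boosting model never learns to fit confidently) can be sketched with scikit-learn's staged predictions. This is an illustrative heuristic, not the paper's exact sample-influence formulation; X, y, and any flagging threshold are assumed inputs.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def stagewise_confidence(X, y, n_estimators=200):
    # Fit a boosting model, then replay its learning dynamics:
    # staged_predict_proba yields predictions after every boosting stage.
    model = GradientBoostingClassifier(n_estimators=n_estimators).fit(X, y)
    traj = np.stack([p[:, 1] for p in model.staged_predict_proba(X)])
    return traj.mean(axis=0)  # per-sample mean confidence across stages

# Labeled actives (y == 1) with persistently low confidence across stages
# would be candidate assay interferents rather than true bioactives.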

Language: English

Citations

11

Bias and Unfairness in Information Retrieval Systems: New Challenges in the LLM Era
Sunhao Dai, Xu Chen, Shicheng Xu

et al.

Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Journal Year: 2024, Volume and Issue: unknown, P. 6437 - 6447

Published: Aug. 24, 2024

With the rapid advancements of large language models (LLMs), information retrieval (IR) systems such as search engines and recommender systems have undergone a significant paradigm shift. This evolution, while heralding new opportunities, introduces emerging challenges, particularly in terms of bias and unfairness, which may threaten the information ecosystem. In this paper, we present a comprehensive survey of existing work on the pressing bias and unfairness issues that arise when LLMs are integrated into IR systems. We first unify these issues as distribution mismatch problems, providing a groundwork for categorizing various mitigation strategies through distribution alignment. Subsequently, we systematically delve into the specific issues arising from three critical stages of LLMs integrated into IR systems: data collection, model development, and result evaluation. In doing so, we meticulously review and analyze the recent literature, focusing on the definitions, characteristics, and corresponding mitigation strategies associated with these issues. Finally, we identify and highlight some open problems and challenges for future work, aiming to inspire researchers and stakeholders in the IR field and beyond to better understand and mitigate these issues in the LLM era. We also consistently maintain a GitHub repository of relevant papers and resources on this rising direction at https://github.com/KID-22/LLM-IR-Bias-Fairness-Survey.

Language: English

Citations

11

Using rhetorical strategies to design prompts: a human-in-the-loop approach to make AI useful
Nupoor Ranade, Marly Saravia, Aditya Johri

et al.

AI & Society, Journal Year: 2024, Volume and Issue: unknown

Published: April 1, 2024

Abstract The growing capabilities of artificial intelligence (AI) word processing models have demonstrated exceptional potential to impact language-related tasks and functions. Their fast pace of adoption and probable effect have also given rise to controversy within certain fields. Models such as GPT-3 are a particular concern for professionals engaged in writing, particularly because their engagement with these technologies is limited by a lack of ability to control the output. Most efforts to maximize control over output rely on a process known as prompt engineering: the construction and modification of the inputted text in expectation of a certain outputted or desired text. Consequently, prompt engineering has emerged as an important consideration for research and practice. Previous conceptions of prompt engineering have largely focused on technical and logistic modifications to back-end processing, remaining inaccessible and, still, unreliable for most users. In this paper, we look to the communication field and its account of text generation (the rhetorical situation) to conceptualize prompt engineering in a way more comprehensible to users by considering the context of rhetoric. We introduce a framework, consisting of a formula, which demands that all components of the rhetorical situation be present in the prompt. This is followed by discussions of the future of AI in writing and its use in both professional and educational settings. Ultimately, the discussion and findings aim to provide a means of integrating agency into writer-centric tools and to advance a human-in-the-loop approach. As generative AI technologies, especially NLP-based ones, become common across societal functions, prompt engineering will play a crucial role not just in the technology itself, but in its productive and responsible use.
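To make the idea concrete, one could enforce such a formula in code by refusing to build a prompt until every component of the rhetorical situation is supplied. The component names below (writer, audience, purpose, context, genre) are one common reading of the rhetorical situation, not necessarily the authors' exact formula.

from dataclasses import dataclass

@dataclass
class RhetoricalSituation:
    writer: str    # who is speaking
    audience: str  # who the text is addressed to
    purpose: str   # what the text should accomplish
    context: str   # situational constraints and background
    genre: str     # expected form of the output

def build_prompt(s: RhetoricalSituation, task: str) -> str:
    # Every field is required by the dataclass, so a prompt cannot be
    # constructed with a component of the rhetorical situation missing.
    return (f"You are {s.writer} writing for {s.audience}. "
            f"Purpose: {s.purpose}. Context: {s.context}. "
            f"Respond in the form of {s.genre}. Task: {task}")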

Language: English

Citations

7

UKRAG: A Unified Knowledge Graph to Enhance Retrieval Augmented Generation Performance

Akram Alkouz, Mohammed I. Al-Saleh, Abdulsalam Alarabeyyat

et al.

Communications in Computer and Information Science, Journal Year: 2025, Volume and Issue: unknown, P. 1 - 19

Published: Jan. 1, 2025

Language: English

Citations

0

Derivational morphology reveals analogical generalization in large language models
Valentin Hofmann, Leonie Weissweiler, David R. Mortensen

et al.

Proceedings of the National Academy of Sciences, Journal Year: 2025, Volume and Issue: 122(19)

Published: May 9, 2025

What mechanisms underlie linguistic generalization in large language models (LLMs)? This question has attracted considerable attention, with most studies analyzing the extent to which the linguistic skills of LLMs resemble rules. As yet, it is not known whether linguistic generalization in LLMs could equally well be explained as the result of analogy. A key shortcoming of prior research is its focus on regular linguistic phenomena, for which rule-based and analogical approaches make the same predictions. Here, we instead examine derivational morphology, specifically English adjective nominalization, which displays notable variability. We introduce a method for investigating linguistic generalization in LLMs: focusing on GPT-J, we fit cognitive models that instantiate rule-based and analogical learning to the LLM training data and compare their predictions on a set of nonce adjectives with those of the LLM, allowing us to draw direct conclusions regarding the underlying mechanisms. As expected, rule-based and analogical models explain the predictions of GPT-J equally well for adjectives with regular nominalization patterns. However, for adjectives with variable nominalization patterns, the analogical model provides a much better match. Furthermore, GPT-J's behavior is sensitive to individual word frequencies, even of regular forms, which is consistent with an analogical account but not a rule-based one. These findings refute the hypothesis that GPT-J's generalization on adjective nominalization involves rules, suggesting analogy as the underlying mechanism. Overall, our study suggests that analogical processes play a bigger role in the linguistic generalization of LLMs than previously thought.
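A basic version of the kind of probe involved here is to compare a causal LM's log probabilities for competing nominalizations of a nonce adjective. The sketch below uses GPT-2 via Hugging Face transformers as a lightweight stand-in for GPT-J, with an invented nonce word; it illustrates the probe only, not the paper's cognitive-model fitting.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in for GPT-J
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def sequence_logprob(text: str) -> float:
    # Sum of token log probabilities of the text under the causal LM.
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logps = torch.log_softmax(logits[:, :-1], dim=-1)
    return logps.gather(2, ids[:, 1:, None]).sum().item()

# Which nominalization of the nonce adjective "cormous" does the model prefer?
for form in ["cormousity", "cormousness"]:
    print(form, sequence_logprob(f"The {form} of the plan surprised everyone."))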

Language: English

Citations

0

Knowledge-Empowered, Collaborative, and Co-Evolving AI Models: The Post-LLM Roadmap
Fei Wu, Tao Shen, Thomas Bäck

et al.

Engineering, Journal Year: 2024, Volume and Issue: 44, P. 87 - 100

Published: Dec. 19, 2024

Language: English

Citations

3

A chat about actinic keratosis: Examining capabilities and user experience of ChatGPT as a digital health technology in dermato‐oncology

Heather Lent, Vinzent Kevin Ortner, Katrine Karmisholt

et al.

JEADV Clinical Practice, Journal Year: 2023, Volume and Issue: 3(1), P. 258 - 265

Published: Oct. 27, 2023

Abstract Background The potential applications of artificial intelligence (AI) in dermatology are evolving rapidly. Chatbots are an emerging trend in healthcare that rely on large language models (LLMs) to generate answers to prompts from users. However, the factuality and user experience (UX) of such chatbots remain to be evaluated in the context of dermato‐oncology. Objectives To examine Chat Generative Pretrained Transformer (ChatGPT) as a reliable source of information on actinic keratosis (AK) and to evaluate clinicians' attitudes and UX with regard to the chatbot. Methods A set of 38 clinical questions was compiled and entered as natural queries in separate, individual conversation threads in ChatGPT (OpenAI, default GPT-3.5). Questions pertained to patient education, diagnosis, and treatment. ChatGPT's responses were presented to a panel of 7 dermatologists for rating of factual accuracy, currency of information, and completeness of the response. Attitudes towards ChatGPT were explored qualitatively and quantitatively using a validated User Experience Questionnaire (UEQ). Results ChatGPT answered 12 questions (31.6%) in a manner that was accurate, current, and complete. It performed best on questions including the pathogenesis of AK and risk factors, but struggled with diagnosis and treatment. Major deficits were seen in grading of AK, providing up‐to‐date treatment guidance, and asserting incorrect information with unwarranted confidence. Further, responses were considered verbose, with an average word count of 198 (SD 55), and overly alarming about malignant transformation. Based on the UEQ responses, the expert panel found ChatGPT an attractive and efficient tool, scoring it highest for speed of information retrieval, but deemed the chatbot inaccurate and verbose, scoring it lowest for clarity. Conclusions While rated high for UX, the underlying LLMs that enable ChatGPT require further development to guarantee the accuracy and concision required in a clinical setting.

Language: English

Citations

7

Evaluating Data Attribution for Text-to-Image Models
Sheng-Yu Wang, Alexei A. Efros, Jun-Yan Zhu

et al.

2023 IEEE/CVF International Conference on Computer Vision (ICCV), Journal Year: 2023, Volume and Issue: unknown, P. 7158 - 7169

Published: Oct. 1, 2023

While large text-to-image models are able to synthesize "novel" images, these images are necessarily a reflection of the training data. The problem of data attribution in such models (which images in the training set are most responsible for the appearance of a given generated image) is a difficult yet important one. As an initial step toward this problem, we evaluate attribution through "customization" methods, which tune an existing large-scale model toward a given exemplar object or style. Our key insight is that this allows us to efficiently create synthetic images that are computationally influenced by the exemplar, by construction. With our new dataset of such exemplar-influenced images, we are able to evaluate various data attribution algorithms and different possible feature spaces. Furthermore, by training on our dataset, we can tune standard models, such as DINO, CLIP, and ViT, toward the attribution problem. Even though the procedure is tuned toward small exemplar sets, we show generalization to larger sets. Finally, taking into account the inherent uncertainty of the problem, we can assign soft attribution scores over a set of training images.
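A simple baseline of the kind evaluated in such feature-space comparisons is to embed the generated image and all training images (e.g., with DINO or CLIP features) and rank by cosine similarity, converting similarities into soft scores. This sketch assumes precomputed feature tensors and an illustrative softmax temperature; it is not the paper's tuned attribution model.

import torch
import torch.nn.functional as F

def soft_attribution(gen_feat: torch.Tensor, train_feats: torch.Tensor, temp: float = 0.05):
    # Cosine similarity between the generated image and each training image,
    # computed on L2-normalized features from some embedding model.
    gen = F.normalize(gen_feat, dim=-1)        # shape (d,)
    train = F.normalize(train_feats, dim=-1)   # shape (n, d)
    sims = train @ gen                         # shape (n,)
    # A softmax turns raw similarities into soft attribution scores over the set.
    return torch.softmax(sims / temp, dim=0)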

Language: English

Citations

6

Impact of Co-occurrence on Factual Knowledge of Large Language Models
Cheongwoong Kang, Jaesik Choi

Published: Jan. 1, 2023

Large language models (LLMs) often give factually incorrect responses despite their success in various applications. In this paper, we hypothesize that relying heavily on the simple co-occurrence statistics of the pre-training corpora is one of the main factors causing factual errors. Our results reveal that LLMs are vulnerable to co-occurrence bias, defined as preferring frequently co-occurring words over the correct answer. Consequently, LLMs struggle to recall facts whose subject and object rarely co-occur in the pre-training dataset, even though they are seen during finetuning. We show that co-occurrence bias remains despite scaling up model sizes or finetuning. Therefore, we suggest finetuning on a debiased dataset to mitigate the bias, by filtering out biased samples whose subject-object co-occurrence count is high. Although debiased finetuning allows LLMs to memorize rare facts in the training set, it is not effective in recalling rare facts unseen during finetuning. Further research on mitigation will help build reliable language models by preventing potential factual errors. The code is available at https://github.com/CheongWoong/impact_of_cooccurrence.
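The debiasing step described above reduces to a count-based filter over the finetuning set. A minimal sketch follows; the data structures (fact triples plus a precomputed co-occurrence table) and the threshold are assumptions for illustration, and the linked repository is the authoritative implementation.

def debias_finetuning_set(samples, cooccurrence, threshold):
    # samples: iterable of (subject, obj, text) facts for finetuning.
    # cooccurrence: dict mapping (subject, obj) -> pre-training co-occurrence count.
    # Keep only facts whose subject and object rarely co-occur in pre-training,
    # so the model must memorize them rather than lean on co-occurrence statistics.
    return [s for s in samples
            if cooccurrence.get((s[0], s[1]), 0) <= threshold]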

Language: English

Citations

4