Durably reducing conspiracy beliefs through dialogues with AI DOI Open Access
Thomas H. Costello, Gordon Pennycook, David G. Rand

et al.

Published: April 3, 2024

Conspiracy theories are a paradigmatic example of beliefs that, once adopted, are extremely difficult to dispel. Influential psychological theories propose that conspiracy beliefs are uniquely resistant to counterevidence because they satisfy important needs and motivations. Here, we raise the possibility that previous attempts to correct conspiracy beliefs have been unsuccessful merely because they failed to deliver counterevidence that was sufficiently compelling and tailored to each believer’s specific theory (which varies dramatically from believer to believer). To evaluate this possibility, we leverage recent developments in generative artificial intelligence (AI) to deliver well-argued, person-specific debunks to a total of N = 2,190 conspiracy theory believers. Participants in our experiments provided detailed, open-ended explanations of a conspiracy theory they believed, then engaged in a three-round dialogue with a frontier AI model (GPT-4 Turbo) which was instructed to reduce the participant’s belief in their conspiracy theory (or to discuss a banal topic, in the control condition). Across two experiments, we find robust evidence that the debunking conversation reduced belief in the chosen conspiracy theory by roughly 20%. This effect did not decay over 2 months’ time, was consistently observed across a wide range of different theories, and occurred even for participants whose beliefs were deeply entrenched and of great importance to their identities. Furthermore, although the dialogues focused on a single theory, the intervention spilled over to reduce belief in unrelated conspiracies, indicating a general decrease in conspiratorial worldview, as well as increasing intentions to challenge others who espouse the chosen conspiracy. These findings highlight that many people who strongly believe in seemingly fact-resistant conspiratorial beliefs can change their minds in the face of sufficient evidence.
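
As an illustration only (the study's actual prompts, materials, and interface are not reproduced here), a dialogue of the kind the abstract describes could be driven from a chat API roughly as in the sketch below. The model name follows the abstract, but the system-prompt wording and loop structure are assumptions.

```python
# Illustrative sketch, NOT the study's materials: a short persuasion
# dialogue conditioned on a participant's own belief statement.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def debunking_dialogue(belief_statement: str, rounds: int = 3) -> list[str]:
    """Run a three-round dialogue against a participant's open-ended
    description of their belief, as the abstract describes."""
    messages = [
        {"role": "system",
         "content": ("You are talking with someone who holds the belief "
                     "described below. Using accurate, well-sourced "
                     "counterevidence tailored to their reasoning, try to "
                     "reduce their confidence in it.\n\nBelief: "
                     + belief_statement)},
    ]
    replies = []
    for _ in range(rounds):
        user_turn = input("Participant: ")  # participant's free-text turn
        messages.append({"role": "user", "content": user_turn})
        response = client.chat.completions.create(
            model="gpt-4-turbo", messages=messages)
        reply = response.choices[0].message.content
        messages.append({"role": "assistant", "content": reply})
        replies.append(reply)
    return replies
```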

Language: English

A survey on large language model based autonomous agents DOI Creative Commons
Lei Wang, Chen Ma, Xueyang Feng

et al.

Frontiers of Computer Science, Journal Year: 2024, Volume and Issue: 18(6)

Published: March 22, 2024

Autonomous agents have long been a research focus in academic and industry communities. Previous research often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and makes the agents hard pressed to achieve human-like decisions. Recently, through the acquisition of vast amounts of Web knowledge, large language models (LLMs) have shown potential for human-level intelligence, leading to a surge in research on LLM-based autonomous agents. In this paper, we present a comprehensive survey of these studies, delivering a systematic review from a holistic perspective. We first discuss the construction of LLM-based autonomous agents, proposing a unified framework that encompasses much of the previous work. Then, we overview the diverse applications of LLM-based autonomous agents in social science, natural science, and engineering. Finally, we delve into the evaluation strategies commonly used for LLM-based autonomous agents. Based on the previous studies, we also present several challenges and future directions in this field.
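
For orientation, a minimal agent loop in the spirit of the construction framework the survey reviews might look like the sketch below. The profile/memory/planning decomposition follows common usage in this literature, and the model name and prompts are illustrative, not taken from the paper.

```python
# Generic LLM-agent loop: persona prompt, rolling memory, and a
# planning call per step. Illustrative only; not the survey's API.
from dataclasses import dataclass, field
from openai import OpenAI

client = OpenAI()

@dataclass
class Agent:
    profile: str                                     # persona / role prompt
    memory: list[str] = field(default_factory=list)  # record of past steps

    def plan(self, task: str) -> str:
        """Planning module: condition the next action on profile + memory."""
        recent = "\n".join(self.memory[-10:])        # recent memory only
        response = client.chat.completions.create(
            model="gpt-4-turbo",
            messages=[
                {"role": "system", "content": self.profile},
                {"role": "user",
                 "content": f"Memory:\n{recent}\n\nTask: {task}\n"
                            "Propose the single next action."},
            ],
        )
        return response.choices[0].message.content

    def step(self, task: str) -> str:
        """Action module: take one step and record it in memory."""
        action = self.plan(task)
        self.memory.append(action)
        return action

agent = Agent(profile="You are a careful research assistant.")
print(agent.step("Summarize the key claims of a given abstract."))
```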

Language: English

Citations

225

A survey of GPT-3 family large language models including ChatGPT and GPT-4 DOI Creative Commons

Katikapalli Subramanyam Kalyan

Natural Language Processing Journal, Journal Year: 2023, Volume and Issue: 6, P. 100048 - 100048

Published: Dec. 19, 2023

Large language models (LLMs) are a special class of pretrained language models (PLMs) obtained by scaling model size, pretraining corpus, and computation. LLMs, because of their large size and pretraining on large volumes of text data, exhibit special abilities which allow them to achieve remarkable performances without any task-specific training in many natural language processing tasks. The era of LLMs started with OpenAI's GPT-3 model, and the popularity of LLMs has increased exponentially after the introduction of models like ChatGPT and GPT-4. We refer to GPT-3 and its successor OpenAI models, including ChatGPT and GPT-4, as the GPT-3 family of large language models (GLLMs). With the ever-rising popularity of GLLMs, especially in the research community, there is a strong need for a comprehensive survey which summarizes recent progress in multiple dimensions and can guide the research community with insightful future directions. We start the survey paper with foundation concepts like transformers, transfer learning, self-supervised learning, pretrained language models, and large language models. We then present a brief overview of GLLMs and discuss their performances in various downstream tasks, specific domains, and multiple languages. We also discuss their data labelling and data augmentation abilities, their robustness, their effectiveness as evaluators, and finally conclude with multiple insightful future research directions. To summarize, this comprehensive survey paper will serve as a good resource for both academic and industry people to stay updated with the latest research related to GLLMs.

Language: English

Citations

122

GPT is an effective tool for multilingual psychological text analysis DOI Open Access
Steve Rathje, Dan-Mircea Mirea, Ilia Sucholutsky

et al.

Published: May 19, 2023

The social and behavioral sciences have been increasingly using automated text analysis to measure psychological constructs in text. We explore whether GPT, the large-language model underlying the artificial intelligence chatbot ChatGPT, can be used as a tool for automated psychological text analysis in several languages. Across 15 datasets (n = 47,925 manually annotated tweets and news headlines), we tested whether different versions of GPT (3.5 Turbo, 4, and 4 Turbo) can accurately detect psychological constructs (sentiment, discrete emotions, offensiveness, and moral foundations) across 12 languages. We found that GPT (r = 0.59-0.77) performs much better than English-language dictionary analysis (r = 0.20-0.30) at detecting psychological constructs as judged by manual annotators. GPT performs nearly as well as, and sometimes better than, several top-performing fine-tuned machine learning models. Moreover, GPT's performance has improved with each successive model, particularly for lesser-spoken languages. Overall, GPT may be superior to many existing methods of automated text analysis, since it achieves relatively high accuracy across many languages, requires no training data, and is easy to use with simple prompts (e.g., "is this text negative?") and little coding experience. We provide sample code and a video tutorial for analyzing text with the GPT application programming interface. We argue that GPT and other large language models may help democratize automated text analysis by making advanced natural language processing capabilities more accessible, and may help facilitate cross-linguistic research with understudied languages.
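
A minimal sketch of the prompt-based classification workflow the abstract describes (not the authors' released sample code) might look like this; the model name and prompt wording are illustrative assumptions.

```python
# Sketch of simple-prompt psychological text classification via the
# OpenAI chat API, in the spirit of the paper's approach.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def rate_sentiment(text: str, model: str = "gpt-4-turbo") -> str:
    """Ask GPT whether a text is negative, mirroring the paper's
    simple-prompt style (e.g., "is this text negative?")."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic labels for annotation work
        messages=[
            {"role": "system",
             "content": "Answer with a single word: yes or no."},
            {"role": "user",
             "content": f"Is the following text negative?\n\n{text}"},
        ],
    )
    return response.choices[0].message.content.strip().lower()

print(rate_sentiment("The service was slow and the food was cold."))
```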

Language: English

Citations

110

GPT is an effective tool for multilingual psychological text analysis DOI Creative Commons
Steve Rathje, Dan-Mircea Mirea, Ilia Sucholutsky

et al.

Proceedings of the National Academy of Sciences, Journal Year: 2024, Volume and Issue: 121(34)

Published: Aug. 12, 2024

The social and behavioral sciences have been increasingly using automated text analysis to measure psychological constructs in text. We explore whether GPT, the large-language model (LLM) underlying the AI chatbot ChatGPT, can be used as a tool for automated psychological text analysis in several languages. Across 15 datasets (n = 47,925 manually annotated tweets and news headlines), we tested whether different versions of GPT (3.5 Turbo, 4, and 4 Turbo) can accurately detect psychological constructs (sentiment, discrete emotions, offensiveness, and moral foundations) across 12 languages. We found that GPT (r = 0.59 to 0.77) performed much better than English-language dictionary analysis (r = 0.20 to 0.30) at detecting psychological constructs as judged by manual annotators. GPT performed nearly as well as, and sometimes better than, several top-performing fine-tuned machine learning models. Moreover, GPT's performance improved with each successive model, particularly for lesser-spoken languages, and became less expensive. Overall, GPT may be superior to many existing methods of automated text analysis, since it achieves relatively high accuracy across many languages, requires no training data, and is easy to use with simple prompts (e.g., "is this text negative?") and little coding experience. We provide sample code and a video tutorial for analyzing text with the GPT application programming interface. We argue that GPT and other LLMs help democratize automated text analysis by making advanced natural language processing capabilities more accessible, and may help facilitate cross-linguistic research with understudied languages.

Language: English

Citations

57

GenAI against humanity: nefarious applications of generative artificial intelligence and large language models DOI Creative Commons
Emilio Ferrara

Journal of Computational Social Science, Journal Year: 2024, Volume and Issue: 7(1), P. 549 - 569

Published: Feb. 22, 2024

Language: English

Citations

54

Durably reducing conspiracy beliefs through dialogues with AI DOI
Thomas H. Costello, Gordon Pennycook, David G. Rand

et al.

Science, Journal Year: 2024, Volume and Issue: 385(6714)

Published: Sept. 12, 2024

Conspiracy theory beliefs are notoriously persistent. Influential hypotheses propose that they fulfill important psychological needs, thus resisting counterevidence. Yet previous failures in correcting conspiracy beliefs may be due to the counterevidence being insufficiently compelling and tailored. To evaluate this possibility, we leveraged developments in generative artificial intelligence and engaged 2,190 conspiracy believers in personalized evidence-based dialogues with GPT-4 Turbo. The intervention reduced conspiracy belief by ~20%. The effect remained 2 months later, generalized across a wide range of conspiracy theories, and occurred even among participants with deeply entrenched beliefs. Although the dialogues focused on a single conspiracy, they nonetheless diminished belief in unrelated conspiracies and shifted conspiracy-related behavioral intentions. These findings suggest that many conspiracy theory believers can revise their views if presented with sufficiently compelling evidence.

Language: English

Citations

37

AI for social science and social science of AI: A survey DOI
R. F. Xu, Yingfei Sun, Mengjie Ren

et al.

Information Processing & Management, Journal Year: 2024, Volume and Issue: 61(3), P. 103665 - 103665

Published: Feb. 8, 2024

Language: English

Citations

32

Analyzing and Mitigating Cultural Hallucinations of Commercial Language Models in Turkish DOI Creative Commons
Yiğithan Boztemir, Nilüfer Çalışkan

Published: May 7, 2024

In an era where artificial intelligence is increasingly interfacing with diverse cultural contexts, the ability of language models to accurately represent and adapt to these contexts is of paramount importance. The present research undertakes a meticulous evaluation of three prominent commercial language models (Google Gemini 1.5, ChatGPT-4, and Anthropic's Claude 3 Sonnet) with a focus on their handling of the Turkish language. Through a dual approach of quantitative metrics, the Cultural Inaccuracy Score (CIS) and the Cultural Sensitivity Index (CSI), alongside qualitative analyses via detailed case studies, disparities in model performances were highlighted. Notably, Claude 3 Sonnet exhibited superior cultural sensitivity, underscoring the effectiveness of its advanced training methodologies. Further analysis revealed that all models demonstrated varying degrees of cultural competence, suggesting significant room for improvement. The findings emphasize the necessity of enriched and diversified training datasets, as well as innovative algorithmic enhancements, to reduce cultural inaccuracies and enhance the models' global applicability. Strategies for mitigating cultural hallucinations are discussed, focusing on refined training processes and continuous evaluation to foster improvements in AI cultural adaptiveness. The study aims to contribute to the ongoing development of these technologies, ensuring they respect and reflect the rich tapestry of human cultures.

Language: English

Citations

24

Human-LLM Collaborative Annotation Through Effective Verification of LLM Labels DOI Open Access
Xinru Wang, Hannah Kim, Sajjadur Rahman

et al.

Published: May 11, 2024

Large language models (LLMs) have shown remarkable performance across various natural language processing (NLP) tasks, indicating their significant potential as data annotators. Although LLM-generated annotations are more cost-effective and efficient to obtain, they are often erroneous for complex or domain-specific tasks and may introduce bias when compared with human annotations. Therefore, instead of completely replacing human annotators with LLMs, we need to leverage the strengths of both LLMs and humans to ensure the accuracy and reliability of annotations. This paper presents a multi-step human-LLM collaborative approach where (1) LLMs generate labels and provide explanations, (2) a verifier assesses the quality of LLM-generated labels, and (3) humans re-annotate a subset of labels with lower verification scores; a sketch of this pipeline follows below. To facilitate human-LLM collaboration, we make use of an LLM's ability to rationalize its decisions. LLM-generated explanations can provide additional information to the verifier model as well as help humans better understand LLM labels. We demonstrate that our approach is able to identify potentially incorrect LLM labels for re-annotation. Furthermore, we investigate the impact of presenting LLM labels and explanations on human re-annotation through crowdsourced studies.
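
A hedged sketch of the three-step pipeline described above; here the verifier is approximated by a simple self-consistency check (resampling labels and scoring agreement), which is a stand-in rather than the paper's own verifier model.

```python
# Sketch of a human-LLM collaborative annotation loop:
# (1) LLM labels + rationale, (2) verifier score, (3) route
# low-scoring items to human re-annotation. Illustrative only.
from collections import Counter
from openai import OpenAI

client = OpenAI()

def llm_label(text: str) -> tuple[str, str]:
    """Step 1: get a label plus a free-text rationale from the LLM."""
    out = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user",
                   "content": ("Label the sentiment of this text as "
                               "positive/negative/neutral, then justify "
                               "briefly.\nFormat: LABEL | RATIONALE\n\n"
                               + text)}],
    ).choices[0].message.content
    label, _, rationale = out.partition("|")
    return label.strip().lower(), rationale.strip()

def verify(text: str, label: str, samples: int = 5) -> float:
    """Step 2: self-consistency verifier -- resample labels and score
    by agreement with the original label (0..1)."""
    votes = [llm_label(text)[0] for _ in range(samples)]
    return Counter(votes)[label] / samples

def annotate(corpus: list[str], threshold: float = 0.6):
    """Step 3: keep confident LLM labels; queue the rest for humans."""
    for text in corpus:
        label, rationale = llm_label(text)
        source = "llm" if verify(text, label) >= threshold else "human-queue"
        yield text, label, rationale, source
```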

Language: English

Citations

21

Reducing LLM Hallucination Using Knowledge Distillation: A Case Study with Mistral Large and MMLU Benchmark DOI Creative Commons
Daniel McDonald, Rachael Papadopoulos, Leslie Benningfield

et al.

Published: May 25, 2024

The application of knowledge distillation to reduce hallucination in large language models represents a novel and significant advancement in enhancing the reliability and accuracy of AI-generated content. The research presented demonstrates the efficacy of transferring knowledge from a high-capacity teacher model to a more compact student model, leading to substantial improvements in exact match accuracy and notable reductions in hallucination rates. The methodology involved the use of temperature scaling, intermediate layer matching, and comprehensive evaluation using the MMLU benchmark, which assessed the model's performance across a diverse set of tasks. Experimental results indicated that the distilled model outperformed the baseline in generating accurate and contextually appropriate responses while maintaining computational efficiency. The findings underscore the potential of knowledge distillation as a scalable solution for improving the robustness of language models, making them more applicable to real-world scenarios that demand high factual accuracy. Future research directions include exploring multilingual and multi-modal distillation, integrating reinforcement learning, and developing more refined evaluation metrics to further enhance performance.
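
The temperature-scaled distillation loss the abstract mentions is standard; a minimal PyTorch sketch follows (not the paper's exact setup, which also includes intermediate-layer matching and a Mistral Large teacher).

```python
# Standard temperature-scaled distillation loss (Hinton-style):
# soften teacher and student logits with temperature T, penalize
# their KL divergence (scaled by T^2), and blend with hard-label CE.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 2.0, alpha: float = 0.5):
    """Blend soft-target KL (scaled by T^2) with hard-label CE."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random logits over 10 classes.
s = torch.randn(4, 10, requires_grad=True)
t = torch.randn(4, 10)
y = torch.randint(0, 10, (4,))
loss = distillation_loss(s, t, y)
loss.backward()
```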

Language: English

Citations

20