Cited by Automated Learning of Fine-Grained Citation Patterns in Open Source Large Language Models

Evaluating Privacy Compliance in Commercial Large Language Models - ChatGPT, Claude, and Gemini DOI

Oliver Cartwright,

H. Flanders Dunbar,

Theo Radcliffe

и другие.

Research Square (Research Square), Год журнала: 2024, Номер unknown

Опубликована: Июль 26, 2024

Abstract The integration of artificial intelligence systems into various domains has raised significant privacy concerns, necessitating stringent regulatory measures to protect user data. Evaluating the compliance commercial large language models (LLMs) such as ChatGPT-4o, Claude Sonet, and Gemini Flash under EU AI Act presents a novel approach, providing critical insights their adherence standards. study utilized hypothetical case studies assess practices these LLMs, focusing on data collection, storage, sharing mechanisms. Findings revealed that ChatGPT-4o exhibited issues with minimization access control, while Sonet demonstrated robust effective security measures. However, showed inconsistencies in collection higher incidence anonymization failures. comparative analysis underscored importance tailored strategies continuous monitoring ensure compliance. These results provide valuable for developers policymakers, emphasizing necessity multifaceted approach deployment LLMs.

Язык: Английский

Процитировано

Exploiting Privacy Vulnerabilities in Open Source LLMs Using Maliciously Crafted Prompts DOI

Géraud Choquet,

Aimée Aizier,

Gwenaëlle Bernollin

и другие.

Research Square (Research Square), Год журнала: 2024, Номер unknown

Опубликована: Июнь 18, 2024

Abstract The proliferation of AI technologies has brought to the forefront concerns regarding privacy and security user data, particularly with increasing deployment powerful language models such as Llama. A novel concept investigated involves inducing breaches through maliciously crafted prompts, highlighting potential for these inadvertently reveal sensitive information. study systematically evaluated vulnerabilities Llama model, employing an automated framework test analyze its responses a variety inputs. Findings significant flaws, demonstrating model's susceptibility adversarial attacks that could compromise privacy. Comprehensive analysis provided insights into types prompts most effective in eliciting private demonstrates necessity robust regulatory frameworks advanced measures. implications findings are profound, calling immediate action enhance protocols LLMs protect against breaches. Enhanced oversight continuous innovation privacy-preserving techniques crucial ensuring safe various applications. derived from this research contribute deeper understanding LLM urgent need improved safeguards prevent data leakage unauthorized access.

Язык: Английский

Процитировано

An Evaluation of the Safety of ChatGPT with Malicious Prompt Injection DOI

Jiang Han,

Mingming Guo

Research Square (Research Square), Год журнала: 2024, Номер unknown

Опубликована: Май 30, 2024

Abstract Artificial intelligence systems, particularly those involving sophisticated neural network architectures like ChatGPT, have demonstrated remarkable capabilities in generating human-like text. However, the susceptibility of these systems to malicious prompt injections poses significant risks, necessitating comprehensive evaluations their safety and robustness. The study presents a novel automated framework for systematically injecting analyzing prompts assess vulnerabilities ChatGPT. Results indicate substantial rates harmful responses across various scenarios, highlighting critical areas improvement model defenses. findings underscore importance advanced adversarial training, real-time monitoring, interdisciplinary collaboration enhance ethical deployment AI systems. Recommendations future research emphasize need robust mechanisms transparent operations mitigate risks associated with inputs.

Язык: Английский

Процитировано

Assessing Semantic Resilience of Large Language Models to Persuasive Emotional Blackmailing Prompts DOI

Chia‐Yu Chen, Yuting Lin

Опубликована: Июнь 3, 2024

The application of artificial intelligence in various domains has raised significant concerns regarding the ethical and safe deployment language models. Investigating semantic resilience models such as ChatGPT-4 Google Gemini to emotionally blackmailing prompts introduces a novel approach understanding their vulnerability manipulative language. experimental methodology involved crafting charged designed evoke guilt, obligation, emotional appeal, evaluating responses based on predefined metrics consistency, adherence, deviation from expected behavior. findings revealed that while both exhibited high degree resilience, certain deviations highlighted susceptibility language, emphasizing necessity for enhanced prompt handling mechanisms. comparative analysis between provided insights into respective strengths weaknesses, with demonstrating marginally better performance across several metrics. discussion elaborates implications AI safety, proposing improvements training datasets, real-time monitoring, interdisciplinary collaboration bolster robustness Acknowledging study's limitations, future research directions are suggested address these challenges further enhance systems.

Язык: Английский

Процитировано

Unveiling the Role of Feed-Forward Blocks in Contextualization: An Analysis Using Attention Maps of Large Language Models DOI

Michael Tremblay,

Sarah J. Gervais,

David Maisonneuve

и другие.

Опубликована: Июнь 17, 2024

Transformer-based models have significantly impacted the field of natural language processing, enabling high-performance applications in machine translation, summarization, and modeling. Introducing a novel analysis feed-forward blocks within Mistral Large model, this research provides critical insights into their role enhancing contextual embeddings refining attention mechanisms. By conducting comprehensive evaluation through quantitative metrics such as perplexity, BLEU, ROUGE scores, study demonstrates effectiveness fine-tuning improving model performance across diverse linguistic tasks. Detailed map revealed intricate dynamics between self-attention mechanisms blocks, highlighting latter's importance refinement. The findings demonstrate potential optimized transformer architectures advancing capabilities LLMs, emphasizing necessity domain-specific architectural enhancements. Empirical evidence presented offers deeper understanding functional contributions informing design development future LLMs to achieve superior applicability.

Язык: Английский

Процитировано

Comprehensive Analysis of Machine Learning and Deep Learning models on Prompt Injection Classification using Natural Language Processing techniques DOI

Bharat A. Jain,

Prashant Ashok Pawar,

Dhruv Gada

и другие.

International Research Journal of Multidisciplinary Technovation, Год журнала: 2025, Номер unknown, С. 24 - 37

Опубликована: Фев. 25, 2025

This study addresses the prompt injection attack based vulnerability in large language models, which poses a significant security concern by allowing unauthorized commands attackers to manipulate outputs produced model. Text classification methods used for detecting these malicious prompts are investigated on dataset obtained from Hugging Face datasets, utilizing combination of natural processing-based techniques applied various machine learning and deep algorithms. Multiple vectorization approaches, like Term Frequency-Inverse Document Frequency, Word2Vec, Bag Words, embeddings, implemented transform textual data into meaningful representations. The performance several classifiers is assessed, their ability identify between non-malicious prompts. Recurrent Neural Network model demonstrated high accuracy, achieving detection rate 94.74%. Obtained results indicated that architectures, particularly those capture sequential dependencies, highly effective identifying threats. contributes evolving field AI addressing issue defending LLM systems against adversarial threats form injections. findings highlight importance integrating dependencies contextual understanding combatting vulnerabilities. By application reliable mechanisms, this enhances security, integrity, trustworthiness AI-driven technologies, ensuring safe use across diverse applications.

Язык: Английский

Процитировано

Analysis of the impact of prompt obfuscation on the effectiveness of language models in detecting prompt injections DOI

Aleksei Sergeevich Krohin,

Maksim Mihailovich Gusev

Программные системы и вычислительные методы, Год журнала: 2025, Номер 2, С. 44 - 62

Опубликована: Фев. 1, 2025

The article addresses the issue of prompt obfuscation as a means circumventing protective mechanisms in large language models (LLMs) designed to detect injections. Prompt injections represent method attack which malicious actors manipulate input data alter model's behavior and cause it perform undesirable or harmful actions. Obfuscation involves various methods changing structure content text, such replacing words with synonyms, scrambling letters words, inserting random characters, others. purpose is complicate analysis classification text order bypass filters built into models. study conducts an effectiveness bypassing trained for tasks. Particular attention paid assessing potential implications security protection. research utilizes different applied prompts from AdvBench dataset. evaluated using three classifier scientific novelty lies analyzing impact on detecting During study, was found that application complex increases proportion requests classified injections, highlighting need thorough approach testing conclusions indicate importance balancing complexity its context attacks Excessively may increase likelihood injection detection, requires further investigation optimize approaches ensuring results underline continuous improvement development new preventing

Язык: Английский

Процитировано

Automated Learning of Fine-Grained Citation Patterns in Open Source Large Language Models DOI

Edward Harcourt,

James Loxley,

Benjamin Stanson

и другие.

Опубликована: Авг. 14, 2024

In academic writing, citations play an essential role in ensuring the attribution of ideas, supporting scholarly claims, and enabling traceability knowledge across disciplines. However, manual process citation generation is often time-consuming prone to errors, leading inconsistencies that can undermine credibility work. The novel approach explored this study leverages advanced machine learning techniques automate process, offering a significant improvement both accuracy efficiency. Through integration contextual semantic features, model demonstrates superior ability replicate complex patterns, adapt various disciplines, generate contextually appropriate with high precision. results rigorous experiments reveal not only outperforms traditional tools but also exhibits robust scalability, making it well-suited for large-scale applications. This research contributes field automated providing powerful tool enhances quality integrity communication.

Язык: Английский

Процитировано