Regulating Generative AI: Ethical Considerations and Explainability Benchmarks DOI Open Access
C.K. Luk, Hoi-Lam Chung, Wai-Kuen Yim et al.

Published: March 20, 2024

This study examines the critical discussion surrounding the ethical regulation and explainability of generative artificial intelligence (AI). Amidst the rapid advancement of AI technologies, the paper identifies and explores the multifaceted concerns that arise, highlighting the paramount importance of transparency, accountability, and fairness. Through an examination of existing regulatory frameworks and the introduction of novel benchmarks for explainability, it advocates a balanced approach that fosters innovation while ensuring oversight. Case studies illustrate generative AI's dual potential to benefit society and to pose significant challenges, underscoring the complexity of its integration across various domains. The findings emphasize the necessity of dynamic regulatory mechanisms, interdisciplinary collaboration, and ongoing research to navigate the evolving AI landscape, aiming to harness its capabilities responsibly for the betterment of humanity.

Language: English

Reducing LLM Hallucination Using Knowledge Distillation: A Case Study with Mistral Large and MMLU Benchmark DOI Creative Commons
Daniel McDonald, Rachael Papadopoulos, Leslie Benningfield et al.

Published: May 25, 2024

The application of knowledge distillation to reduce hallucination in large language models represents a novel and significant advancement in enhancing the reliability and accuracy of AI-generated content. The research presented demonstrates the efficacy of transferring knowledge from a high-capacity teacher model to a more compact student model, leading to substantial improvements in exact match scores and notable reductions in hallucination rates. The methodology involved the use of temperature scaling, intermediate layer matching, and a comprehensive evaluation using the MMLU benchmark, which assessed the model's performance across a diverse set of tasks. Experimental results indicated that the distilled model outperformed the baseline in generating accurate and contextually appropriate responses while maintaining computational efficiency. The findings underscore the potential of knowledge distillation as a scalable solution for improving the robustness of language models, making them applicable to real-world scenarios that demand high factual accuracy. Future directions include exploring multilingual and multi-modal distillation, integrating reinforcement learning, and developing refined evaluation metrics to further enhance performance.
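The teacher-to-student transfer with temperature scaling described in this abstract can be sketched as a loss function. This is an illustrative sketch of the standard distillation objective, not the paper's implementation; the temperature `T`, the mixing weight `alpha`, and the toy logits are assumptions.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Soft-target KD loss: KL from softened teacher to softened student
    (scaled by T^2 to keep gradients comparable), mixed with hard-label CE."""
    p_t = softmax(teacher_logits, T)  # softened teacher distribution
    p_s = softmax(student_logits, T)  # softened student distribution
    kd = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s)) * (T * T)
    ce = -math.log(softmax(student_logits)[label])
    return alpha * kd + (1 - alpha) * ce
```

When student and teacher agree, the KL term vanishes and only the hard-label cross-entropy remains; a mismatched teacher strictly increases the loss, which is what drives the student toward the teacher's softened distribution.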

Language: English

Cited: 20

Reducing Hallucinations in Large Language Models Through Contextual Position Encoding DOI Open Access

Sarah Desrochers, James Wilson, Matthew Beauchesne et al.

Published: May 31, 2024

In natural language processing, maintaining factual accuracy and minimizing hallucinations in text generation remain significant challenges. Contextual Position Encoding (CPE) presents a novel approach by dynamically encoding positional information based on the context of each token, significantly enhancing the model's ability to generate accurate and coherent text. The integration of CPE into the Mistral Large model resulted in marked improvements in precision, recall, and F1-score, demonstrating superior performance over traditional positional encoding methods. Furthermore, the enhanced architecture effectively reduced hallucination rates, increasing the reliability of generated outputs. Comparative analysis with baseline models such as GPT-3 and BERT confirmed the efficacy of CPE, highlighting its potential to influence future developments in LLM architecture. The results underscore the importance of advanced encoding techniques in improving the applicability of large language models across various domains requiring high factual accuracy.
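The abstract does not specify how CPE conditions position on context; one way to sketch the idea is to let each token advance the position counter by a context-dependent gate in [0, 1] rather than a fixed step of 1, then feed the resulting fractional positions into a standard sinusoidal encoding. The gate values and the 8-dimensional encoding below are illustrative assumptions.

```python
import math

def sinusoidal(pos, dim=8):
    """Standard sinusoidal encoding, valid for fractional positions too."""
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        enc += [math.sin(pos * freq), math.cos(pos * freq)]
    return enc

def contextual_positions(gates):
    """Cumulative gated positions: each token advances the position by a
    context-dependent gate in [0, 1] instead of a fixed step of 1."""
    pos, out = 0.0, []
    for g in gates:
        pos += g
        out.append(pos)
    return out

# The gates would come from a learned function of each token's context;
# these constants are purely illustrative.
positions = contextual_positions([1.0, 0.2, 0.9, 0.1])
encodings = [sinusoidal(p) for p in positions]
```

Tokens whose gates are near zero effectively share a position with their neighbors, letting the model measure distance in context-relevant units (e.g. sentences or entities) rather than raw token counts.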

Language: English

Cited: 20

Efficiency in Language Understanding and Generation: An Evaluation of Four Open-Source Large Language Models DOI Creative Commons
Siu Ming Wong, Ho-fung Leung, Ka Yan Wong et al.

Research Square (Research Square), Journal year: 2024, Issue: unknown

Published: March 11, 2024

Abstract This study provides a comprehensive evaluation of the efficiency of Large Language Models (LLMs) in performing diverse language understanding and generation tasks. Through a systematic comparison of open-source models including GPT-Neo, Bloom, FLAN-T5, and Mistral-7B, the research explores their performance across widely recognized benchmarks such as GLUE, SuperGLUE, LAMBADA, and SQuAD. Our findings reveal significant variations in model accuracy, computational efficiency, scalability, and adaptability, underscoring the influence of architecture and training paradigms on outcomes. The study identifies key factors contributing to the models' efficiency and offers insights into potential optimization strategies for enhancing their applicability in real-world NLP applications. By highlighting the strengths and limitations of current LLMs, this work contributes to the ongoing development of more effective, efficient, and adaptable models, paving the way for future advancements in the field of natural language processing.
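Benchmark comparisons of this kind reduce to scoring each model's predictions against references. A minimal sketch of such a harness, with the normalization rule and the `model_fn` interface as assumptions (the paper does not describe its evaluation code):

```python
def exact_match(prediction, reference):
    """Normalized exact match: case- and whitespace-insensitive."""
    norm = lambda s: " ".join(s.lower().split())
    return norm(prediction) == norm(reference)

def benchmark_accuracy(model_fn, examples):
    """Fraction of (question, answer) pairs whose prediction matches
    the reference exactly after normalization."""
    hits = sum(exact_match(model_fn(q), a) for q, a in examples)
    return hits / len(examples)
```

The same loop works for any of the listed benchmarks once each task is flattened to (input, reference) pairs; metrics like F1 would replace `exact_match` per task.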

Language: English

Cited: 18

Equipping Llama with Google Query API for Improved Accuracy and Reduced Hallucination DOI Creative Commons

Young Hwan Bae, Hye Rin Kim, Jae‐Hoon Kim et al.

Research Square (Research Square), Journal year: 2024, Issue: unknown

Published: March 6, 2024

Abstract This study investigates the integration of the Llama 2 7B large language model (LLM) with the Google Query API to enhance its accuracy and reduce hallucination instances. By leveraging real-time internet data, we aimed to address the limitations of static training datasets and to improve the model's performance across various natural language processing tasks. The methodology involved augmenting Llama 2 7B's architecture to incorporate dynamic data retrieval from the API, followed by an evaluation of the impact on accuracy and hallucination reduction using the BIG-Bench benchmark. The results indicate significant improvements in both accuracy and reliability, demonstrating the effectiveness of integrating LLMs with external data sources. This not only marks a substantial advancement in LLM capabilities but also raises important considerations regarding bias, privacy, and the ethical use of internet-sourced information. The study's findings contribute to the ongoing discourse on enhancing LLMs, suggesting a promising direction for future research and development in artificial intelligence.
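The core of such retrieval augmentation is grounding the prompt in freshly fetched snippets before generation. A minimal sketch of the prompt-building step, assuming the snippets have already been fetched from the search API (the function name, instruction wording, and snippet cap are illustrative, not the paper's):

```python
def build_augmented_prompt(question, snippets, max_snippets=3):
    """Prepend retrieved snippets to the question so the model can ground
    its answer in fresh evidence rather than stale training data."""
    context = "\n".join(f"[{i+1}] {s}" for i, s in enumerate(snippets[:max_snippets]))
    return (
        "Answer using only the evidence below; say 'unknown' if it is insufficient.\n"
        f"Evidence:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

Capping the number of snippets keeps the prompt within the context window, and the explicit "unknown" instruction is one common way to discourage the model from hallucinating beyond the evidence.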

Language: English

Cited: 18

Combining LoRA to GPT-Neo to Reduce Large Language Model Hallucination DOI Creative Commons

Shi-han Huang, Chia-Yu Chen

Research Square (Research Square), Journal year: 2024, Issue: unknown

Published: June 4, 2024

Abstract The deployment of Large Language Models (LLMs) often suffers from the generation of hallucinations, leading to outputs that appear plausible but are factually inaccurate or nonsensical. Incorporating Low-Rank Adaptation (LoRA) into GPT-Neo presents a novel approach to mitigating these hallucinations by leveraging the efficiency of low-rank approximations. This research details the integration of LoRA into GPT-Neo, demonstrating significant improvements in predictive performance, factual accuracy, and the reduction of hallucination rates. The augmented model shows enhanced robustness and efficiency, making it more suitable for applications requiring high accuracy and reliability. Through comprehensive evaluations involving perplexity, BLEU, and ROUGE-L scores, along with qualitative analysis, the study highlights the model's ability to generate coherent and contextually appropriate text. The findings demonstrate LoRA's potential to transform LLM fine-tuning by reducing computational complexity and memory footprint, thus facilitating the use of large-scale models in resource-constrained environments. This advancement opens new possibilities for applications across various domains, ensuring the accuracy and coherence of generated content.
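The low-rank approximation at the heart of LoRA replaces a full weight update with the product of two narrow matrices: the effective weight is W + (alpha/r) * down @ up, where only `down` and `up` are trained. A minimal sketch with toy matrices (the shapes, `alpha`, and `r` here are illustrative; real implementations keep W frozen in the base model and apply the update per attention/MLP projection):

```python
def matmul(A, B):
    """Naive matrix product for small illustrative matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_forward(x, W, lora_down, lora_up, alpha=16, r=2):
    """y = x @ (W + (alpha/r) * down @ up): frozen weight W plus a
    trainable rank-r update, scaled by alpha/r as in the LoRA paper."""
    scale = alpha / r
    delta = [[scale * v for v in row] for row in matmul(lora_down, lora_up)]
    W_eff = [[w + d for w, d in zip(rw, rd)] for rw, rd in zip(W, delta)]
    return matmul(x, W_eff)
```

With `lora_down` of shape (d_in, r) and `lora_up` of shape (r, d_out), the update adds only r*(d_in + d_out) trainable parameters instead of d_in*d_out, which is the source of the memory savings the abstract describes.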

Language: English

Cited: 15

Integrating Deep Learning with Symbolic Reasoning in TinyLlama for Accurate Information Retrieval DOI Creative Commons
Xingyu Xiong, Mingliang Zheng

Research Square (Research Square), Journal year: 2024, Issue: unknown

Published: January 23, 2024

Abstract This study presents a novel approach to enhancing information retrieval capabilities in Large Language Models (LLMs) by integrating deep learning with symbolic reasoning, specifically within the TinyLlama model. The research addresses the inherent limitations of LLMs in processing contextually complex queries and ensuring factual accuracy. By amalgamating intuitive pattern recognition with structured, rule-based logic, the improved model demonstrates a significant elevation in performance. The study employs BIG-bench benchmark tasks to empirically validate the model's enhancements in accuracy, logical consistency, and rule adherence. Additionally, it emphasizes the importance of interpretability and trust, positioning the hybrid model as a more transparent and reliable AI tool. The findings not only showcase the efficacy of the hybrid architecture but also pave the way for future research focusing on sophisticated cognitive functions and autonomous adaptation to dynamic environments. This work sets a precedent for the evolution of LLMs, moving towards systems capable of nuanced reasoning akin to human processes.
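The symbolic side of such a hybrid can be as simple as forward-chaining over Horn rules to validate claims the neural model proposes. The abstract does not describe the reasoning engine; this is a generic sketch of rule adherence checking, with the fact and rule encodings as assumptions:

```python
def symbolic_check(facts, rules, claim):
    """Forward-chain simple Horn rules (premises -> conclusion) over known
    facts until a fixed point, then test whether the claim is derivable."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return claim in known
```

In a hybrid pipeline, the LLM's candidate answer would be translated into a `claim`; answers the rule base cannot derive are flagged or rejected, which is one route to the logical consistency gains the abstract reports.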

Language: English

Cited: 14

Evaluating Privacy Compliance in Commercial Large Language Models - ChatGPT, Claude, and Gemini DOI Creative Commons

Oliver Cartwright, H. Flanders Dunbar, Theo Radcliffe et al.

Research Square (Research Square), Journal year: 2024, Issue: unknown

Published: July 26, 2024

Abstract The integration of artificial intelligence systems into various domains has raised significant privacy concerns, necessitating stringent regulatory measures to protect user data. Evaluating the compliance of commercial large language models (LLMs) such as ChatGPT-4o, Claude Sonet, and Gemini Flash under the EU AI Act presents a novel approach, providing critical insights into their adherence to privacy standards. The study utilized hypothetical case studies to assess the privacy practices of these LLMs, focusing on data collection, storage, and sharing mechanisms. Findings revealed that ChatGPT-4o exhibited issues with data minimization and access control, while Claude Sonet demonstrated robust and effective security measures. Gemini Flash, however, showed inconsistencies in data collection and a higher incidence of anonymization failures. The comparative analysis underscored the importance of tailored compliance strategies and continuous monitoring to ensure adherence. These results provide valuable guidance for developers and policymakers, emphasizing the necessity of a multifaceted approach to compliant LLM deployment.

Language: English

Cited: 13

Implementing Automated Error Correction and Feedback Loops in Kimi, A Chinese Large Language Model DOI Open Access

Wai-lam Cheung, C.K. Luk

Published: April 24, 2024

The enhancement of the Chinese Large Language Model Kimi through the integration of automated error correction mechanisms and feedback loops was explored in this study. The primary objective was to develop and implement a system that reduces linguistic errors in real-time and adapts dynamically to evolving language patterns without extensive retraining. Using a combination of natural language processing techniques and machine learning algorithms, the enhanced model demonstrated significant improvements in accuracy, precision, recall, and user satisfaction compared to the baseline model. The introduction of adaptive feedback components enabled continuous improvement and user-driven model adaptation. The findings indicate that such enhancements can substantially increase the reliability and efficiency of Large Language Models, particularly in non-English contexts, setting a precedent for future research and development in the field. The study's implications extend to broader applications in AI, suggesting potential for other language models and AI systems requiring high linguistic sensitivity and adaptability.
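Adaptation without retraining, as described above, can be approximated by accumulating user corrections in a lookup applied at output time. This is a deliberately minimal sketch of the feedback-loop idea, not Kimi's mechanism; the class and method names are illustrative:

```python
class FeedbackLoop:
    """Minimal sketch: user corrections accumulate in a lookup table that
    is applied to later outputs, approximating adaptation without retraining."""

    def __init__(self):
        self.corrections = {}

    def record(self, wrong, right):
        """Store a user-supplied correction for future outputs."""
        self.corrections[wrong] = right

    def apply(self, text):
        """Rewrite known errors in a generated text before returning it."""
        for wrong, right in self.corrections.items():
            text = text.replace(wrong, right)
        return text
```

A production system would score and age corrections rather than apply them unconditionally, but the loop structure — collect, store, apply, repeat — is the same.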

Language: English

Cited: 12

A Longchain Approach to Reduce Hallucinations in Large Language Models DOI Creative Commons

Jinchao Li, Quan Hong

Research Square (Research Square), Journal year: 2024, Issue: unknown

Published: June 5, 2024

Abstract The increasing deployment of natural language processing models in critical domains necessitates addressing the issue of hallucinations, where generated outputs may be factually incorrect or nonsensical. The longchain approach, which involves an iterative refinement process, offers a novel and significant method to mitigate hallucinations by enhancing both the accuracy and coherence of model outputs. The methodology involved modifying the GPT-3 architecture to incorporate additional layers for intermediate evaluations and corrections, followed by rigorous training and evaluation using the MMLU dataset. Quantitative results demonstrated that the modified model significantly outperformed the baseline across various performance metrics, including precision, recall, F1-score, logical coherence, and hallucination rate. Qualitative analysis further supported these findings, showcasing the practical benefits of the approach in producing accurate and contextually relevant outputs. The study emphasizes the theoretical foundations of iterative learning and continuous improvement, providing a robust framework for enhancing the reliability of language models. The implications of the findings are substantial for applications in healthcare, legal advice, and education, where the generation of reliable text is paramount. By reducing hallucinations and improving accuracy, the longchain approach contributes to the development of more trustworthy and effective language models.
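The iterative evaluate-and-correct loop at the core of the longchain approach can be sketched as a control flow, independent of how the evaluator and corrector are implemented. The stopping threshold, round limit, and callback interface below are illustrative assumptions:

```python
def longchain_refine(draft, evaluate, correct, max_rounds=3, threshold=0.9):
    """Iteratively score a draft and apply corrections until the evaluator's
    score clears the threshold or the round budget is exhausted."""
    for _ in range(max_rounds):
        score = evaluate(draft)     # e.g. a factuality/coherence scorer
        if score >= threshold:
            break
        draft = correct(draft)      # e.g. a correction pass over the draft
    return draft
```

In the paper's setting, `evaluate` and `correct` correspond to the intermediate evaluation and correction layers added to the architecture; here they are plain callables so the loop itself is testable.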

Language: English

Cited: 12

Dynamic Supplementation of Federated Search Results for Reducing Hallucinations in LLMs DOI Open Access
Jichang Chen, Xinnan Huang, Yongping Li et al.

Published: June 6, 2024

The increasing use of AI-generated content has highlighted the critical issue of hallucinations, where models produce factually incorrect or misleading outputs. Addressing this challenge, a novel approach dynamically supplements responses with federated search engine results in real-time to significantly reduce hallucinations and enhance response accuracy. The methodology involves integrating data from multiple search engines into responses generated by the Mistral Large model, thereby providing more accurate and contextually appropriate output. Comprehensive evaluation using the Microsoft PromptBench dataset demonstrates substantial improvements in accuracy and relevance, along with a reduction in hallucinations. Quantitative performance metrics, statistical analysis, and detailed case studies confirm the effectiveness of the dynamic supplementation approach. The findings suggest significant implications for developing reliable AI applications across various domains, emphasizing the potential of hybrid systems that combine the strengths of large language models and information retrieval. Future research directions include refining the triggering mechanisms, expanding the data sources, and optimizing the supplementation process for further scalability.
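Federating several engines requires merging their ranked result lists before supplementation. The abstract does not say how the results are fused; a common heuristic is reciprocal rank fusion, sketched below (the constant 60 is the conventional RRF damping value, an assumption here):

```python
def federated_merge(results_by_engine, top_k=5):
    """Merge ranked result lists from several engines by summing reciprocal
    ranks (RRF-style): documents ranked well by many engines float to the top."""
    scores = {}
    for results in results_by_engine.values():
        for rank, doc in enumerate(results, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (60 + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Because RRF needs only ranks, not engine-specific relevance scores, it sidesteps score-calibration differences between engines, which makes it a natural fit for federated search.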

Language: English

Cited: 12