Cross-Lingual Factual Accuracy and Ideological Divergence in Large Language Models

Cheng-en Tsai,

Mei-chi Huang

Published: June 10, 2024

The novel concept of cross-lingual factual accuracy verification explores the consistency and reliability of responses produced by large language models when posed identical questions in English and Chinese. This study analyzed the performance of ChatGPT and Google Gemini, revealing high alignment overall but notable divergences in ideologically sensitive areas, attributed to cultural and ideological biases in the training data. A comprehensive methodology incorporating both quantitative metrics and qualitative assessments was employed to evaluate the capabilities of these models. The results demonstrate the potential of language models in multilingual applications while highlighting a critical need for bias mitigation strategies. The implications extend to enhancing the development and deployment of AI systems in diverse contexts, emphasizing the importance of neutrality in handling information. The research contributes significantly to understanding the strengths and limitations of cross-lingual verification, providing a foundation for future improvements in methodologies and applications.
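The core measurement the study describes, asking the same question in two languages and flagging divergent answers, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the lexical-similarity proxy (real evaluations would use semantic similarity or human judgment), and the 0.7 threshold are all assumptions, and the Chinese answer is assumed to have been machine-translated back to English first.

```python
from difflib import SequenceMatcher

def agreement(answer_en: str, answer_zh_translated: str) -> float:
    """Rough lexical agreement between two answers to the same question.
    Both strings are assumed to be in English already."""
    return SequenceMatcher(None, answer_en.lower(),
                           answer_zh_translated.lower()).ratio()

def divergence_report(pairs, threshold=0.7):
    """Return question IDs whose cross-lingual answers fall below threshold."""
    return [qid for qid, (en, zh) in pairs.items()
            if agreement(en, zh) < threshold]

pairs = {
    "q1": ("The capital of France is Paris.",
           "The capital of France is Paris."),
    "q2": ("The event was a protest.",
           "The event was a riot incited by foreign forces."),
}
flagged = divergence_report(pairs)  # only the ideologically divergent q2
```

Factual items ("q1") tend to agree across languages, while ideologically sensitive items ("q2") are exactly where the study reports divergence.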

Language: English

Hallucination Reduction in Large Language Models with Retrieval-Augmented Generation Using Wikipedia Knowledge

Jason Kirchenbauer,

Caleb Barns

Published: May 30, 2024

Natural language understanding and generation have seen great progress, yet the persistent issue of hallucination undermines the reliability of model outputs. Introducing retrieval-augmented generation (RAG) with external knowledge sources, such as Wikipedia, presents a novel and significant approach to enhancing factual accuracy and coherence in generated content. By dynamically integrating relevant information, the Mistral model demonstrates substantial improvements in precision, recall, and overall response quality. This research offers a robust framework for mitigating hallucinations, providing valuable insights for deploying reliable AI systems in critical applications. The comprehensive evaluation underscores the potential of RAG to advance the performance and trustworthiness of large language models.
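The RAG pattern the abstract describes, retrieve relevant external text and condition generation on it, can be sketched in a few lines. This is a toy illustration under stated assumptions: the token-overlap retriever stands in for a real Wikipedia index (e.g. BM25 or dense embeddings), and the function names and prompt template are hypothetical, not from the paper.

```python
def retrieve(query, corpus, k=1):
    """Rank passages by word overlap with the query (a stand-in for a
    real retriever over Wikipedia) and return the top k."""
    q = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda p: len(q & set(p.lower().split())),
                    reverse=True)
    return scored[:k]

def build_rag_prompt(question, corpus):
    """Prepend retrieved evidence so the model answers from it rather
    than from (possibly hallucinated) parametric memory."""
    context = "\n".join(retrieve(question, corpus))
    return (f"Context:\n{context}\n\n"
            f"Answer using only the context above.\nQuestion: {question}")

corpus = [
    "Mount Everest is Earth's highest mountain above sea level.",
    "The Nile is a major river in northeastern Africa.",
]
prompt = build_rag_prompt("What is the highest mountain above sea level?", corpus)
```

The prompt then goes to the generator; grounding the answer in retrieved text is what drives the precision and recall gains the abstract reports.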

Language: English

Cited

23

Reducing Hallucinations in Large Language Models Through Contextual Position Encoding

Sarah Desrochers,

James Wilson,

Matthew Beauchesne

et al.

Published: May 31, 2024

In natural language processing, maintaining factual accuracy and minimizing hallucinations in text generation remain significant challenges. Contextual Position Encoding (CPE) presents a novel approach by dynamically encoding positional information based on the context of each token, significantly enhancing a model's ability to generate accurate and coherent text. The integration of CPE into the Mistral Large model resulted in marked improvements in precision, recall, and F1-score, demonstrating superior performance over traditional methods. Furthermore, the enhanced architecture effectively reduced hallucination rates, increasing the reliability of generated outputs. Comparative analysis with baseline models such as GPT-3 and BERT confirmed the efficacy of CPE, highlighting its potential to influence future developments in LLM architecture. The results underscore the importance of advanced encoding techniques in improving the applicability of large language models across domains requiring high accuracy.
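The key idea of contextual position encoding, positions that depend on what the tokens are rather than only where they sit, can be sketched as a cumulative sum of context-dependent gates. This is a simplified toy: published contextual position schemes use learned, attention-conditioned gates, whereas the stopword gate below is a hypothetical hand-written stand-in chosen only to make the mechanism visible.

```python
def contextual_positions(tokens, gate):
    """Assign each token a position equal to the running sum of a
    context-dependent gate. Tokens the gate scores 0 do not advance
    the position counter, unlike a plain 0,1,2,... index."""
    pos, positions = 0.0, []
    for tok in tokens:
        pos += gate(tok)
        positions.append(pos)
    return positions

# Hypothetical gate: only content words advance the position counter.
STOPWORDS = {"the", "a", "of", "and"}
gate = lambda tok: 0.0 if tok.lower() in STOPWORDS else 1.0

tokens = ["The", "cat", "sat", "on", "a", "mat"]
positions = contextual_positions(tokens, gate)
# [0.0, 1.0, 2.0, 3.0, 3.0, 4.0] -- "The" and "a" share positions
# with their neighbors instead of consuming index slots.
```

Because positions now count contextually meaningful units, the model can attend to "the k-th content word" directly, which is the property the abstract credits for more accurate generation.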

Language: English

Cited

20

Combining LoRA to GPT-Neo to Reduce Large Language Model Hallucination

Shi-han Huang,

Chia-Yu Chen

Research Square (Research Square), Journal year: 2024, Issue: unknown

Published: June 4, 2024

The deployment of Large Language Models (LLMs) often suffers from hallucinations, leading to outputs that appear plausible but are factually inaccurate or nonsensical. Incorporating Low-Rank Adaptation (LoRA) into GPT-Neo presents a novel approach to mitigating these hallucinations by leveraging the efficiency of low-rank approximations. This research details the integration of LoRA into GPT-Neo, demonstrating significant improvements in predictive performance and factual accuracy, and a reduction in hallucination rates. The augmented model shows enhanced robustness and efficiency, making it more suitable for applications requiring high accuracy and reliability. Through comprehensive evaluations involving perplexity, BLEU, and ROUGE-L scores, as well as qualitative analysis, the study highlights the model's ability to generate coherent and contextually appropriate text. The findings demonstrate the potential of LoRA to transform LLM deployment by reducing computational complexity and memory footprint, thus facilitating the use of large-scale models in resource-constrained environments. This advancement opens new possibilities across various domains, ensuring the coherence of generated content.
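The low-rank adaptation mechanism itself is compact enough to sketch directly: a frozen weight matrix W is adjusted by a product of two small matrices A (d_in × r) and B (r × d_out), scaled by alpha / r, so only r·(d_in + d_out) parameters are trained. The tiny pure-Python matrices below are illustrative only; in practice this is applied per attention projection inside GPT-Neo.

```python
def matmul(A, B):
    """Naive matrix multiply for small pure-Python matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_forward(x, W, A, B, alpha, r):
    """y = x @ (W + (alpha / r) * A @ B): the frozen weight W is
    perturbed by the trainable low-rank update A @ B."""
    scale = alpha / r
    AB = matmul(A, B)
    W_adapted = [[w + scale * ab for w, ab in zip(wr, abr)]
                 for wr, abr in zip(W, AB)]
    return matmul(x, W_adapted)

# Rank-1 update of a 2x2 identity weight.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0], [0.0]]   # 2 x r, with r = 1
B = [[0.0, 1.0]]     # r x 2
y = lora_forward([[1.0, 2.0]], W, A, B, alpha=1.0, r=1)  # [[1.0, 3.0]]
```

During training only A and B receive gradients, which is the source of the reduced memory footprint the abstract mentions; at inference the update can be merged into W so there is no extra cost.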

Language: English

Cited

15

Dynamic Supplementation of Federated Search Results for Reducing Hallucinations in LLMs

Jichang Chen,

Xinnan Huang,

Yongping Li

et al.

Published: June 6, 2024

The increasing use of AI-generated content has highlighted the critical issue of hallucinations, where models produce factually incorrect or misleading outputs. Addressing this challenge, a novel approach dynamically supplements model responses with federated search engine results in real time to significantly reduce hallucinations and enhance response accuracy. The methodology involves integrating data from multiple search engines into responses generated by the Mistral Large model, thereby providing more accurate and contextually appropriate output. Comprehensive evaluation using the Microsoft PromptBench dataset demonstrates substantial improvements in accuracy and relevance and a reduction in hallucinations. Quantitative performance metrics, statistical analysis, and detailed case studies confirm the effectiveness of the dynamic supplementation approach. The findings suggest significant implications for developing reliable AI applications across various domains, emphasizing the potential of hybrid systems that combine the strengths of large language models and information retrieval. Future research directions include refining the triggering mechanisms, expanding the data sources, and optimizing the process for further scalability.
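The described loop, trigger on an uncertain response, fan out to several search engines, merge their results into the output, can be sketched as below. Everything here is hypothetical scaffolding: the confidence threshold, the engine stubs, and the snippet-appending format are illustrative assumptions, not the paper's actual triggering mechanism.

```python
def supplement(answer, confidence, question, engines, threshold=0.8):
    """If model confidence falls below the threshold, query every
    configured (federated) search engine and append the merged,
    de-duplicated snippets to the answer."""
    if confidence >= threshold:
        return answer
    seen, snippets = set(), []
    for engine in engines:
        for snippet in engine(question):
            if snippet not in seen:
                seen.add(snippet)
                snippets.append(snippet)
    return answer + "\n\nSources:\n" + "\n".join(snippets)

# Hypothetical stubs standing in for real federated search backends.
engine_a = lambda q: ["Paris is the capital of France."]
engine_b = lambda q: ["Paris is the capital of France.",
                      "France is in Europe."]

out = supplement("The capital might be Lyon.", 0.4,
                 "capital of France?", [engine_a, engine_b])
```

A confident response passes through untouched, so the retrieval cost is only paid when hallucination risk is high, which is the scalability argument behind the triggering mechanism.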

Language: English

Cited

12

Efficient Large Language Model Inference with Vectorized Floating Point Calculations

Jacob Owens,

Skylar Matthews

Published: June 13, 2024

The development of highly sophisticated language models has revolutionized various natural language processing tasks, demanding efficient inference processes to ensure real-time responsiveness and minimal computational resource usage. Vectorized floating point calculations present a novel and significant approach to enhancing the efficiency of model inference, leveraging parallel processing capabilities to achieve substantial performance improvements. This article details the implementation of vectorized operations within GPT-Neo, demonstrating a notable 12% increase in inference speed through comprehensive benchmarks and datasets. The evaluation highlights the optimized model's ability to reduce inference time, increase throughput, and lower memory usage and energy consumption without compromising accuracy. The findings reveal the potential of vectorized operations to enhance the scalability and operational efficiency of advanced models, paving the way for more responsive and resource-efficient AI applications across diverse deployment scenarios.
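The structure of a vectorized floating-point loop, several independent accumulator lanes processed per step, then a horizontal reduction and a scalar tail, can be mimicked in plain Python even though real speedups come from SIMD hardware and libraries, not from an interpreter. The lane width of 4 is an illustrative assumption (e.g. 4 × float32 in a 128-bit register); this shows the loop shape, not the paper's implementation.

```python
LANES = 4  # hypothetical SIMD width

def dot_scalar(a, b):
    """Reference scalar dot product: one multiply-add per iteration."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y
    return acc

def dot_vectorized(a, b):
    """Process LANES elements per step with LANES independent
    accumulators, reduce horizontally, then handle the scalar tail --
    the structure a compiler emits for a vectorized FP loop."""
    acc = [0.0] * LANES
    n = len(a) - len(a) % LANES
    for i in range(0, n, LANES):
        for lane in range(LANES):
            acc[lane] += a[i + lane] * b[i + lane]
    total = sum(acc)          # horizontal reduction
    for i in range(n, len(a)):  # remainder that does not fill a vector
        total += a[i] * b[i]
    return total

a = [float(i) for i in range(1, 11)]
b = [float(i) for i in range(1, 11)]
```

One caveat worth noting: reassociating floating-point sums across lanes can change results in the last bits, so "without compromising accuracy" generally means within rounding tolerance rather than bit-exact equality.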

Language: English

Cited

10

Large Language Model Understands Chinese Better with Mega Tokenization

Xinyu Lu,

Qizhen Wang,

Xian Liu

et al.

Research Square (Research Square), Journal year: 2024, Issue: unknown

Published: June 10, 2024

The rapid evolution of natural language processing has seen significant advancements in language models, particularly for languages with simpler orthographies. However, challenges persist in accurately tokenizing and understanding languages with complex morphological structures, such as Chinese, due to the limitations of traditional tokenization methods. Introducing mega tokenization, which involves significantly larger tokens, represents a novel and transformative approach that enhances semantic preservation and contextual coherence for sophisticated character sequences. The study compares the performance of an adapted model against a standard model, demonstrating substantial improvements across tasks such as machine translation, text summarisation, and question answering. Through rigorous evaluation and statistical analysis, the adapted model shows superior metrics, indicating the effectiveness of mega tokenization in addressing the unique challenges posed by the Chinese language. The implications of this research extend to various applications, underscoring its potential to revolutionise multilingual processing in high-stakes environments. Future research directions are proposed to further optimise the approach and expand its applicability to diverse linguistic contexts.
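The effect of larger tokens can be seen with a greedy longest-match tokenizer: given a vocabulary of multi-character entries, a Chinese phrase collapses into a few semantically whole units instead of many characters. The two vocabularies below are made-up examples for illustration; the paper's actual vocabulary construction is not specified here.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization: at each position take the
    longest vocabulary entry that prefixes the remaining text, falling
    back to a single character when nothing matches."""
    tokens, i = [], 0
    max_len = max(map(len, vocab))
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in vocab:
                tokens.append(piece)
                i += length
                break
    return tokens

text = "自然语言处理"                      # "natural language processing"
char_vocab = set(text)                    # character-level baseline
mega_vocab = {"自然语言", "处理"}          # hypothetical "mega" entries

baseline = tokenize(text, char_vocab)     # 6 single-character tokens
mega = tokenize(text, mega_vocab)         # 2 semantically whole tokens
```

Fewer, longer tokens mean each token carries a complete semantic unit and the sequence the model must attend over shrinks, which is the intuition behind the reported gains on Chinese tasks.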

Language: English

Cited

3

Evaluating Abstract Reasoning and Problem-Solving Abilities of Large Language Models Using Raven's Progressive Matrices

C. C. Zhang,

Liuyun Wang

Research Square (Research Square), Journal year: 2024, Issue: unknown

Published: June 11, 2024

Artificial intelligence has rapidly evolved, leading to the development of powerful models capable of performing complex cognitive tasks. Evaluating the abilities of these models through established human tests such as Raven's Progressive Matrices (RPM) offers a novel and significant approach to understanding their abstract reasoning capabilities. The study adapted RPM for text-based interactions, enabling the evaluation of Mistral and Llama without human intervention. Results revealed that both models surpass average human performance in overall accuracy, demonstrating advanced problem-solving skills. However, the analysis also highlighted variability across different types of tasks, with the models excelling at sequential pattern recognition while showing weaknesses in spatial awareness. These findings provide valuable insights into the strengths and limitations of Mistral and Llama, offering a comprehensive benchmark to guide future advancements in artificial intelligence.
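The per-task-type scoring the study reports (strong on sequential patterns, weak on spatial ones) amounts to grouping multiple-choice accuracy by task category. The item IDs, task-type labels, and answer letters below are fabricated placeholders for illustration, not data from the paper.

```python
def score_rpm(responses, answer_key):
    """Accuracy of multiple-choice answers, broken down by task type.
    answer_key maps item id -> (task_type, correct_option)."""
    by_type = {}
    for item_id, (task_type, correct) in answer_key.items():
        hit = responses.get(item_id) == correct
        correct_so_far, total = by_type.get(task_type, (0, 0))
        by_type[task_type] = (correct_so_far + hit, total + 1)
    return {t: c / n for t, (c, n) in by_type.items()}

# Hypothetical text-adapted RPM items and one model's answers.
answer_key = {
    "rpm01": ("sequential", "C"),
    "rpm02": ("sequential", "A"),
    "rpm03": ("spatial", "B"),
}
model_answers = {"rpm01": "C", "rpm02": "A", "rpm03": "D"}

scores = score_rpm(model_answers, answer_key)
# {'sequential': 1.0, 'spatial': 0.0}
```

Splitting accuracy by category is what exposes the asymmetry the abstract highlights: an overall score above the human average can still hide a near-failing spatial sub-score.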

Language: English

Cited

3
