Gemini or ChatGPT? Capability, Performance, and Selection of Cutting-Edge Generative Artificial Intelligence (AI) in Business Management

Nitin Rane, Saurabh Choudhary, Jayesh Rane et al.

SSRN Electronic Journal, Year: 2024, Issue: unknown

Published: Jan. 1, 2024

Language: English

Dynamic Supplementation of Federated Search Results for Reducing Hallucinations in LLMs

Access: Open Access

Jichang Chen, Xinnan Huang, Yongping Li et al.

Published: June 6, 2024

The increasing use of AI-generated content has highlighted the critical issue of hallucinations, where models produce factually incorrect or misleading outputs. Addressing this challenge, a novel approach dynamically supplements federated search engine results in real-time to significantly reduce hallucinations and enhance response accuracy. The methodology involves integrating data from multiple search engines into the responses generated by the Mistral Large model, thereby providing more accurate and contextually appropriate output. Comprehensive evaluation using the Microsoft PromptBench dataset demonstrates substantial improvements in accuracy, relevance, and reduction of hallucinations. Quantitative performance metrics, statistical analysis, and detailed case studies confirm the effectiveness of the dynamic supplementation approach. The findings suggest significant implications for developing reliable AI applications across various domains, emphasizing the potential of hybrid systems that combine the strengths of large language models and information retrieval. Future research directions include refining the triggering mechanisms, expanding the data sources, and optimizing the process for further scalability.

Language: English

Cited by: 12

Reducing Cultural Hallucination in Non-English Languages Via Prompt Engineering for Large Language Models

Access: Open Access

Kanato SATO, Haruto Kaneko, Mei Fujimura et al.

Published: May 6, 2024

Advancements in prompt engineering offer significant potential for mitigating cultural hallucinations in large language models (LLMs). The strategic formulation of prompts, when combined with deep cultural and linguistic insights, enhances the accuracy and sensitivity of LLMs, particularly in non-English contexts. This paper explores the application of prompt engineering across three major LLMs (OpenAI ChatGPT, Google Gemini, and Anthropic Claude), demonstrating how tailored prompts can effectively reduce biases and improve user interaction. Through case studies and comparative analysis, the research identifies best practices and provides recommendations for further development. The findings emphasize the importance of continuous innovation and ethical considerations in AI to ensure inclusivity and respect for diversity in global technology applications.

Language: English

Cited by: 11

Measuring the Visual Hallucination in ChatGPT on Visually Deceptive Images

Access: Open Access

Linzhi Ping, Yue Gu, Liefeng Feng et al.

Published: May 28, 2024

The evaluation of visual hallucinations in multimodal AI models is novel and significant because it addresses a critical gap in understanding how AI systems interpret visually deceptive inputs. The study systematically assessed ChatGPT's performance on a synthetic dataset of visually deceptive and non-deceptive images, employing both quantitative and qualitative analysis. Results revealed that while ChatGPT achieved high accuracy on standard visual recognition tasks, its performance diminished when faced with visually deceptive images, highlighting areas for further improvement. The analysis provided insights into the model's underlying mechanisms, such as extensive pretraining and sophisticated integration capabilities, which contribute to its robustness against visual deceptions. The study's findings have important implications for the development of more reliable and robust AI technologies, offering a benchmark for future evaluations and practical guidelines for enhancing AI systems.

Language: English

Cited by: 9

CyberQ: Generating Questions and Answers for Cybersecurity Education Using Knowledge Graph-Augmented LLMs

Access: Open Access

Garima Agrawal, Kuntal Kumar Pal, Yuli Deng et al.

Proceedings of the AAAI Conference on Artificial Intelligence, Year: 2024, Issue: 38(21), pp. 23164-23172

Published: March 24, 2024

Building a skilled cybersecurity workforce is paramount to building a safer digital world. However, the diverse skill set, constantly emerging vulnerabilities, and deployment of new cyber threats make learning cybersecurity challenging. Traditional education methods struggle to cope with cybersecurity's rapidly evolving landscape and to keep students engaged and motivated. Different studies on students' behaviors show that an interactive mode of education by engaging through a question-answering system or dialoguing is one of the most effective learning methodologies. There is a strong need to create advanced AI-enabled education tools to promote interactive learning in cybersecurity. Unfortunately, there are no publicly available standard question-answer datasets to build such systems for students and novice learners to learn cybersecurity concepts, tools, and techniques. The education course material and online question banks are unstructured and need to be validated and updated by domain experts, which is tedious when done manually. In this paper, we propose CyberGen, a novel unification of large language models (LLMs) and knowledge graphs (KG) to generate the questions and answers for cybersecurity automatically. Augmenting the structured knowledge from knowledge graphs in prompts improves factual reasoning and reduces hallucinations in LLMs. We used the knowledge triples from cybersecurity knowledge graphs (AISecKG) to design prompts for ChatGPT and generate questions and answers using different prompting techniques. Our question-answer dataset, CyberQ, contains around 4k pairs of questions and answers. The domain expert manually evaluated the random samples for consistency and correctness. We train the generative model using the CyberQ dataset for the question answering task.

Language: English

Cited by: 8

Enhancing Contextual Understanding of Mistral LLM with External Knowledge Bases

License: Creative Commons

Miyu Sasaki, Natsumi Watanabe, Tsukihito Komanaka et al.

Research Square (Research Square), Year: 2024, Issue: unknown

Published: April 5, 2024

Abstract: This study explores the enhancement of contextual understanding and factual accuracy in Language Learning Models (LLMs), specifically Mistral LLM, through the integration of external knowledge bases. We developed a novel methodology for dynamically incorporating real-time information from diverse sources, aiming to address the inherent limitations of LLMs rooted in their training datasets. Our experiments demonstrated significant improvements in accuracy, precision, recall, and F1 score, alongside qualitative enhancements in response relevance and factual accuracy. The research also tackled the computational challenges of integrating external knowledge, ensuring the model's efficiency and practical applicability. This work not only highlights the potential of external knowledge bases to augment the capabilities of LLMs but also sets the stage for future advancements in creating more intelligent, adaptable, and contextually aware AI systems. The findings contribute to the broader field of NLP by offering insights into overcoming the traditional limitations of LLMs, presenting a step toward developing AI systems with enhanced real-world applicability and accessibility.

Language: English

Cited by: 8

Editing Language Model-Based Knowledge Graph Embeddings

Access: Open Access

Siyuan Cheng, Ningyu Zhang, Bozhong Tian et al.

Proceedings of the AAAI Conference on Artificial Intelligence, Year: 2024, Issue: 38(16), pp. 17835-17843

Published: March 24, 2024

Recent decades have witnessed the empirical success of framing Knowledge Graph (KG) embeddings via language models. However, language model-based KG embeddings are usually deployed as static artifacts, making them difficult to modify after deployment without re-training. To address this issue, we propose a new task of editing language model-based KG embeddings in this paper. This task is designed to facilitate rapid, data-efficient updates to KG embeddings without compromising the performance of other aspects. We build four new datasets: E-FB15k237, A-FB15k237, E-WN18RR, and A-WN18RR, and evaluate several knowledge editing baselines, demonstrating the limited ability of previous models to handle the proposed challenging task. We further propose a simple yet strong baseline dubbed KGEditor, which utilizes additional parametric layers of the hypernetwork to edit/add facts. Our comprehensive experimental results reveal that KGEditor excels in updating specific facts without impacting the overall performance, even when faced with limited training resources. Code and datasets will be available at https://github.com/AnonymousForPapers/DeltaKG.

Language: English

Cited by: 7

Designing Incremental Knowledge Enrichment in Generative Pre-trained Transformers

License: Creative Commons

Emilia A. Kowalczyk, Mateusz Nowakowski, Z Brzezińska et al.

Research Square (Research Square), Year: 2024, Issue: unknown

Published: April 1, 2024

Abstract: This article presents a novel approach to Incremental Knowledge Enrichment tailored for GPT-Neo, addressing the challenge of keeping Large Language Models (LLMs) updated with the latest information without undergoing comprehensive retraining. We introduce a dynamic linking mechanism that enables the real-time integration of diverse data sources, thereby enhancing the model's accuracy, timeliness, and relevance. Through rigorous evaluation, our method demonstrates significant improvements in model performance across several metrics. The research contributes a scalable and efficient solution to one of the most pressing issues in AI, potentially revolutionizing the maintenance and applicability of LLMs. The findings underscore the feasibility of creating more adaptive, responsive, and sustainable generative models, opening new avenues for future advancements in the field.

Language: English

Cited by: 7

An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration

Yihao Li, Ru Zhang, Jianyi Liu et al.

Lecture Notes in Computer Science, Year: 2024, Issue: unknown, pp. 251-265

Published: Jan. 1, 2024

Language: English

Cited by: 7

The Alzheimer’s Knowledge Base: A Knowledge Graph for Alzheimer Disease Research

License: Creative Commons

Joseph D. Romano, Van Truong, Rachit Kumar et al.

Journal of Medical Internet Research, Year: 2023, Issue: 26, p. e46777

Published: Nov. 7, 2023

Background: As global populations age and become susceptible to neurodegenerative illnesses, new therapies for Alzheimer disease (AD) are urgently needed. Existing data resources for drug discovery and repurposing fail to capture relationships central to the disease’s etiology and response to drugs. Objective: We designed the Alzheimer’s Knowledge Base (AlzKB) to alleviate this need by providing a comprehensive knowledge representation of AD etiology and candidate therapeutics. Methods: We designed the AlzKB as a large, heterogeneous graph knowledge base assembled using 22 diverse external data sources describing biological and pharmaceutical entities at different levels of organization (eg, chemicals, genes, anatomy, and diseases). AlzKB uses a Web Ontology Language 2 ontology to enforce semantic consistency and allow for ontological inference. We provide a public version of AlzKB and allow users to run and modify local versions of the knowledge base. Results: AlzKB is freely available on the web and currently contains 118,902 entities with 1,309,527 relationships between those entities. To demonstrate its value, we used graph data science and machine learning to (1) propose new therapeutic targets based on similarities of AD to Parkinson disease and (2) repurpose existing drugs that may treat AD. For each use case, AlzKB recovers known therapeutic associations while proposing biologically plausible new ones. Conclusions: AlzKB is a new, publicly available knowledge resource that enables researchers to discover complex translational associations for AD drug discovery. Through 2 use cases, we show that it is a valuable tool for proposing novel therapeutic hypotheses based on public biomedical knowledge.

Language: English

Cited by: 14

Higher Performance of Mistral Large on MMLU Benchmark through Two-Stage Knowledge Distillation

License: Creative Commons

J R Wilkins, Michael Rodriguez

Research Square (Research Square), Year: 2024, Issue: unknown

Published: May 14, 2024

Abstract: Large language models (LLM) have undergone significant transformations through the application of knowledge distillation techniques aimed at enhancing performance on complex benchmarks like MMLU. The research detailed herein introduces a novel two-stage knowledge distillation process designed to refine the capabilities of Mistral Large, resulting in marked improvements in both accuracy and contextual understanding. Initially, the model undergoes a teacher-student training phase where a high-performing teacher model imparts its knowledge to a less complex student model, utilizing soft and hard target methods to optimize knowledge transfer. This is followed by a specialized refinement stage where the model is further fine-tuned on tasks that require advanced cognitive skills, specifically tailored to the challenges presented by the MMLU benchmark. Quantitative results indicate a substantial increase in performance across various tasks within the benchmark, while qualitative analyses show enhanced linguistic sophistication and relevance in the model's responses. Comparisons with baseline models confirm that the distilled model significantly outperforms traditional approaches, setting new standards for language models. The implications of our findings suggest that structured knowledge distillation can fundamentally alter the development trajectory of language models, making them more efficient and effective in diverse applications. The study's approach offers a scalable framework for future enhancements and has the potential to influence a wide range of applications in artificial intelligence, from automated conversational systems to sophisticated analytical tools.

Language: English

Cited by: 5