Cited by Exploring a long short-term memory for mountain flood forecasting based on watershed-internal knowledge graph and large language model

MINES: Message Intercommunication for Inductive Relation Reasoning over Neighbor-Enhanced Subgraphs

Ke Liang, Lingyuan Meng, Sihang Zhou, et al.

Proceedings of the AAAI Conference on Artificial Intelligence, Journal Year: 2024, Volume and Issue: 38(9), P. 10645 - 10653

Published: March 24, 2024

GraIL and its variants have shown their promising capacities for inductive relation reasoning on knowledge graphs. However, the uni-directional message-passing mechanism hinders such models from exploiting hidden mutual relations between entities in directed graphs. Besides, the enclosing subgraph extraction in most GraIL-based models restricts the model from extracting enough discriminative information for reasoning. Consequently, the expressive ability of these models is limited. To address the problems, we propose a novel GraIL-based framework, termed MINES, by introducing a Message Intercommunication mechanism on the Neighbor-Enhanced Subgraph. Concretely, the message intercommunication mechanism is designed to capture the omitted hidden mutual information. It introduces bi-directed information interactions between connected entities by inserting an undirected/bi-directed GCN layer between uni-directed RGCN layers. Moreover, inspired by the success of involving more neighbors in other graph-based tasks, we extend the neighborhood area beyond the enclosing subgraph to enhance the information collection for inductive relation reasoning. Extensive experiments prove the promising capacity of the proposed MINES from various aspects, especially for the superiority, effectiveness, and transfer ability.

Language: English

Efficiency in Language Understanding and Generation: An Evaluation of Four Open-Source Large Language Models

Siu Ming Wong, Ho-fung Leung, Ka Yan Wong, et al.

Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown

Published: March 11, 2024

This study provides a comprehensive evaluation of the efficiency of Large Language Models (LLMs) in performing diverse language understanding and generation tasks. Through a systematic comparison of open-source models including GPT-Neo, Bloom, FLAN-T5, and Mistral-7B, the research explores their performance across widely recognized benchmarks such as GLUE, SuperGLUE, LAMBADA, and SQuAD. Our findings reveal significant variations in model accuracy, computational efficiency, scalability, and adaptability, underscoring the influence of model architecture and training paradigms on performance outcomes. The study identifies key factors contributing to the models' performance and offers insights into potential optimization strategies for enhancing their applicability in real-world NLP applications. By highlighting the strengths and limitations of current LLMs, this research contributes to the ongoing development of more effective, efficient, and adaptable language models, paving the way for future advancements in the field of natural language processing.

Language: English

The rise and potential of large language model based agents: a survey

Zhiheng Xi, Wen-Xiang Chen, Xin Hua Guo, et al.

Science China Information Sciences, Journal Year: 2025, Volume and Issue: 68(2)

Published: Jan. 17, 2025

Language: English

Using Large Language Models to Better Detect and Handle Software Vulnerabilities and Cyber Security Threats

Seyed Mohammad Taghavi, Farid Feyzi

Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown

Published: May 21, 2024

Large Language Models (LLMs) have emerged as powerful tools in the domain of software vulnerability and cybersecurity tasks, offering promising capabilities in detecting and handling security threats. This article explores the utilization of LLMs in various aspects of cybersecurity, including vulnerability detection, threat prediction, and automated code repair. We explain the concept of LLMs, highlighting their applications, and evaluate their effectiveness and challenges through a literature review. We explore LLMs across different domains, showcasing their proficiency in tasks like malware detection and code summarization. Comparing LLMs to traditional methods, our work highlights their superior performance in identifying vulnerabilities and proposing fixes. Furthermore, we outline the workflow of LLM models, emphasizing their integration into cyber security frameworks and incident response systems. We also discuss complementary methods that enhance LLMs' capabilities, including static and dynamic analyzers. Additionally, we synthesize findings from previous research, demonstrating how the utilization of LLMs has significantly enhanced productivity in addressing security threats. Finally, the study offers insights into optimizing the implementation of LLMs based on lessons learned from existing literature.

Language: English

Efficiently Updating Domain Knowledge in Large Language Models: Techniques for Knowledge Injection without Comprehensive Retraining

Emily Czekalski, D.C. Watson

Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown

Published: June 6, 2024

Recent advancements in natural language processing have highlighted the critical importance of efficiently updating pre-trained models with domain-specific knowledge. Traditional methods requiring comprehensive retraining are resource-intensive and impractical for many applications. The proposed techniques for knowledge injection, including the integration of adapter layers, retrieval-augmented generation (RAG), and knowledge distillation, offer a novel and significant solution to this challenge by enabling efficient updates without extensive retraining. Adapter layers allow for specialized fine-tuning, preserving the model's original capabilities while incorporating new information. RAG enhances the contextual relevance of generated responses by dynamically retrieving pertinent information from a knowledge base. Knowledge distillation transfers knowledge from a smaller to a larger model, augmenting its performance in specialized domains. Experimental results demonstrated substantial improvements in accuracy, precision, recall, and F1-score, along with enhanced contextual relevance and coherence. The findings demonstrate the potential of the proposed techniques to maintain model accuracy in dynamic, information-rich environments, making them particularly useful for fields requiring timely and accurate information.

Language: English

Integrating Deep Learning with Symbolic Reasoning in TinyLlama for Accurate Information Retrieval

Xingyu Xiong, Mingliang Zheng

Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown

Published: Jan. 23, 2024

This study presents a novel approach to enhancing information retrieval capabilities in Large Language Models (LLMs) by integrating deep learning with symbolic reasoning, specifically in the TinyLlama model. The research addresses the inherent limitations of LLMs in processing contextually complex queries and ensuring factual accuracy. By amalgamating the intuitive pattern recognition of deep learning with the structured, rule-based logic of symbolic reasoning, the improved model demonstrates a significant elevation in performance. The study employs BIG-bench benchmark tasks to empirically validate the model's enhancements in accuracy, logical consistency, and rule adherence. Additionally, the study emphasizes the importance of interpretability and trust, positioning the hybrid model as a more transparent and reliable AI tool. The findings not only showcase the efficacy of the hybrid architecture but also pave the way for future research, focusing on sophisticated cognitive functions and autonomous adaptation in dynamic environments. This work sets a precedent for the evolution of LLMs, moving towards AI systems capable of nuanced reasoning akin to human cognitive processes.

Language: English

Graph Neural Prompting with Large Language Models

Yijun Tian, Huan Song, Zichen Wang, et al.

Proceedings of the AAAI Conference on Artificial Intelligence, Journal Year: 2024, Volume and Issue: 38(17), P. 19080 - 19088

Published: March 24, 2024

Large language models (LLMs) have shown remarkable generalization capability with exceptional performance in various language modeling tasks. However, they still exhibit inherent limitations in precisely capturing and returning grounded knowledge. While existing work has explored utilizing knowledge graphs (KGs) to enhance language modeling via joint training and customized model architectures, applying this to LLMs is problematic owing to their large number of parameters and high computational cost. Therefore, how to enhance pre-trained LLMs using grounded knowledge, e.g., retrieval-augmented generation, remains an open question. In this work, we propose Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-trained LLMs in learning beneficial knowledge from KGs. GNP encompasses various designs, including a standard graph neural network encoder, a cross-modality pooling module, a domain projector, and a self-supervised link prediction objective. Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks across different LLM sizes and settings. Code is available at https://github.com/meettyj/GNP.

Language: English

Multimodal Integration in Large Language Models: A Case Study with Mistral LLM

N. Sulaiman, Farizal Hamzah

Published: April 22, 2024

This work presents significant advancements in the multimodal capabilities of the Mistral 8x7B model, a large language model designed with eight experts of seven billion parameters each. We introduce comprehensive modifications to its architecture, data fusion techniques, and training procedures, aimed at improving the integration and processing of text, image, and audio data. Our experimental results demonstrate that these enhancements lead to superior performance across multiple modalities when compared to existing benchmarks. The improved model showcases enhanced accuracy, F1 scores, and index, confirming its ability to offer more coherent and contextually appropriate outputs. This research not only sets new benchmarks for multimodal language models but also opens up further avenues for applying such models in real-world, diverse, and dynamic environments.

Language: English

Measuring the Interpretability and Explainability of Model Decisions of Five Large Language Models

Kaito Fujiwara, Miyu Sasaki, Akira Nakamura, et al.

Published: March 20, 2024

This study conducts a comprehensive analysis of the interpretability and explainability of five leading Large Language Models (LLMs): TripoSR by Stability AI, Gemma-7b by Google, Mistral 7B by Mistral AI, Llama-2-7b by Meta, and GemMoE-Beta-1 by CrystalCare AI. Through a methodical evaluation encompassing both qualitative and quantitative benchmarks, we assess these models' capacity to make their decision-making processes understandable to humans. Our findings reveal significant variability in the models' ability to provide transparent reasoning and accurate, contextually relevant explanations across different contexts. Notably, one model demonstrated superior transparency, while another excelled in the accuracy of its explanations. However, challenges in maintaining consistent explainability across varying inputs and the need for enhanced adaptability to feedback highlight areas for future improvement. This research underscores the importance of interpretability and explainability in fostering trust and reliability in LLM applications, advocating for continued advancement to achieve more transparent, accountable, and user-centric AI systems. Directions for future research include the development of standardized evaluation methodologies and interdisciplinary approaches to enhance model transparency and user understanding.

Language: English

Cross-Domain Knowledge Transfer without Retraining to Facilitating Seamless Knowledge Application in Large Language Models

Jae Hoon Kim, Hye Rin Kim

Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown

Published: April 29, 2024

Cross-domain knowledge transfer in large language models (LLMs) presents significant challenges, particularly regarding the extensive resources required for retraining. This research introduces innovative embedding adaptation and context adjustment techniques that enable LLMs to efficiently transfer knowledge across diverse domains without the need for comprehensive retraining. Experimental results demonstrate improved model flexibility and reduced computational demands, highlighting the potential for rapid deployment and scalability. These findings suggest a sustainable approach to deploying adaptive AI across various sectors, significantly impacting future developments in artificial intelligence.

Language: English
