SSRN Electronic Journal, Year: 2024, Issue: unknown
Published: Jan. 1, 2024
Language: English
SSRN Electronic Journal, Year: 2024, Issue: unknown
Published: Jan. 1, 2024
Language: English
Proceedings of the AAAI Conference on Artificial Intelligence, Year: 2024, Issue: 38(9), pp. 10645-10653
Published: Mar. 24, 2024
GraIL and its variants have shown promising capacities for inductive relation reasoning on knowledge graphs. However, the uni-directional message-passing mechanism hinders such models from exploiting hidden mutual relations between entities in directed graphs. Besides, the enclosing subgraph extraction in most GraIL-based models restricts the model from extracting enough discriminative information for reasoning. Consequently, the expressive ability of these models is limited. To address these problems, we propose a novel GraIL-based framework, termed MINES, by introducing a Message Intercommunication mechanism and a Neighbor-Enhanced Subgraph. Concretely, the message intercommunication mechanism is designed to capture the omitted hidden mutual information. It introduces bi-directed information interactions between connected entities by inserting an undirected/bi-directed GCN layer between uni-directed RGCN layers. Moreover, inspired by the success of involving more neighbors in other graph-based tasks, we extend the neighborhood area beyond the enclosing subgraph to enhance the information collection for inductive relation reasoning. Extensive experiments prove the promising capacity of the proposed MINES from various aspects, especially its superiority, effectiveness, and transfer ability.
Language: English
Cited by: 19
Science China Information Sciences, Year: 2025, Issue: 68(2)
Published: Jan. 17, 2025
Language: English
Cited by: 19
Research Square (Research Square), Year: 2024, Issue: unknown
Published: Mar. 11, 2024
Abstract: This study provides a comprehensive evaluation of the efficiency of Large Language Models (LLMs) in performing diverse language understanding and generation tasks. Through a systematic comparison of open-source models, including GPT-Neo, Bloom, FLAN-T5, and Mistral-7B, the research explores their performance across widely recognized benchmarks such as GLUE, SuperGLUE, LAMBADA, and SQuAD. Our findings reveal significant variations in model accuracy, computational efficiency, scalability, and adaptability, underscoring the influence of model architecture and training paradigms on performance outcomes. The study identifies key factors contributing to the models' performance and offers insights into potential optimization strategies for enhancing their applicability in real-world NLP applications. By highlighting the strengths and limitations of current LLMs, this study contributes to the ongoing development of more effective, efficient, and adaptable language models, paving the way for future advancements in the field of natural language processing.
Language: English
Cited by: 18
Research Square (Research Square), Year: 2024, Issue: unknown
Published: May 21, 2024
Language: English
Cited by: 17
Research Square (Research Square), Year: 2024, Issue: unknown
Published: June 6, 2024
Language: English
Cited by: 16
Research Square (Research Square), Year: 2024, Issue: unknown
Published: Jan. 23, 2024
Abstract: This study presents a novel approach to enhancing information retrieval capabilities in Large Language Models (LLMs) by integrating deep learning with symbolic reasoning, specifically in the TinyLlama model. The research addresses the inherent limitations of LLMs in processing contextually complex queries and ensuring factual accuracy. By amalgamating the intuitive pattern recognition of deep learning with the structured, rule-based logic of symbolic reasoning, the improved model demonstrates a significant elevation in performance. The study employs BIG-bench benchmark tasks to empirically validate the model's enhancements in accuracy, logical consistency, and rule adherence. Additionally, the study emphasizes the importance of interpretability and trust, positioning the hybrid model as a more transparent and reliable AI tool. The findings not only showcase the efficacy of the hybrid architecture but also pave the way for future research, focusing on more sophisticated cognitive functions and autonomous adaptation in dynamic environments. This work sets a precedent for the evolution of LLMs, moving towards AI systems capable of nuanced reasoning akin to human cognitive processes.
Language: English
Cited by: 14
Proceedings of the AAAI Conference on Artificial Intelligence, Year: 2024, Issue: 38(17), pp. 19080-19088
Published: Mar. 24, 2024
Large language models (LLMs) have shown remarkable generalization capability with exceptional performance in various language modeling tasks. However, they still exhibit inherent limitations in precisely capturing and returning grounded knowledge. While existing work has explored utilizing knowledge graphs (KGs) to enhance language modeling via joint training and customized model architectures, applying this to LLMs is problematic owing to their large number of parameters and high computational cost. Therefore, how to enhance pre-trained LLMs using grounded knowledge, e.g., retrieval-augmented generation, remains an open question. In this work, we propose Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-trained LLMs in learning beneficial knowledge from KGs. GNP encompasses various designs, including a standard graph neural network encoder, a cross-modality pooling module, a domain projector, and a self-supervised link prediction objective. Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks across different LLM sizes and settings. Code is available at https://github.com/meettyj/GNP.
Language: English
Cited by: 14
Published: Apr. 22, 2024
This work presents significant advancements in the multimodal capabilities of the Mistral 8x7B model, a large language model designed with eight experts of seven billion parameters each. We introduce comprehensive modifications to its architecture, data fusion techniques, and training procedures, aimed at improving the integration and processing of text, image, and audio data. Our experimental results demonstrate that these enhancements lead to superior performance across multiple modalities when compared to existing benchmarks. The improved model showcases enhanced accuracy, F1 scores, and index values, confirming its ability to offer more coherent and contextually appropriate outputs. This research not only sets new benchmarks for multimodal models but also opens up further avenues for applying such models in real-world, diverse, and dynamic environments.
Language: English
Cited by: 14
Published: Mar. 20, 2024
This study conducts a comprehensive analysis of the interpretability and explainability of five leading Large Language Models (LLMs): TripoSR by Stability AI, Gemma-7b by Google, Mistral 7B by Mistral AI, Llama-2-7b by Meta, and GemMoE-Beta-1 by CrystalCare AI. Through a methodical evaluation encompassing both qualitative and quantitative benchmarks, we assess these models' capacity to make their decision-making processes understandable to humans. Our findings reveal significant variability in the models' ability to provide transparent reasoning and accurate, contextually relevant explanations across different contexts. Notably, one model demonstrated superior transparency, while another excelled in the accuracy of its explanations. However, challenges in maintaining consistent explainability across varying inputs and the need for enhanced adaptability to feedback highlight areas for future improvement. This research underscores the importance of interpretability and explainability in fostering trust and reliability in LLM applications, advocating for continued advancement to achieve more transparent, accountable, and user-centric AI systems. Directions for future research include the development of standardized evaluation methodologies and interdisciplinary approaches to enhance model transparency and user understanding.
Language: English
Cited by: 14
Research Square (Research Square), Year: 2024, Issue: unknown
Published: Apr. 29, 2024
Language: English
Cited by: 12