PLoS ONE, Journal Year: 2025, Volume and Issue: 20(3), P. e0318644 - e0318644
Published: March 13, 2025
The water levels associated with mountain floods exhibit rapid fluctuations within small watersheds, necessitating extensive data on various factors influencing such disasters to facilitate real-time forecasting. This study investigates the application of Long Short-Term Memory (LSTM) networks for flood forecasting, designing a watershed-internal Knowledge Graph (KG) and Large Language Model (LLM) that encompass watershed relationships and internal information structures. We have developed a hydrological KG for the Qixi Reservoir and Qiaodongcun forecasting points located in Zhejiang Province, China, to systematically organize water conservancy data, identify significant disaster-related factors, optimize the input, and determine the most effective combination of factors for forecasting water levels. Additionally, we implemented Recurrent Neural Networks (RNN) and Gated Recurrent Units (GRU) for comparative analysis with LSTM. The findings indicate that the LSTM model, when integrated with the KG and LLM, can effectively incorporate critical elements influencing water level changes, and the accuracy of the LLM-KG-LSTM model is enhanced by 3% compared to the standard LSTM. The LLM-KG-LSTM series outperforms both the RNN and GRU models. Our method will guide future research from the perspective of focusing on algorithms to the relationship between multi-dimensional disaster data and algorithm parallelism.
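The abstract above does not spell out the network internals, so the following is only a minimal sketch of the standard LSTM cell update that such a forecaster builds on. The feature layout (for example rainfall and upstream level per time step), the helper names `make_params` and `run_lstm`, and the constant weights are all hypothetical and chosen for illustration; a real model would learn its weights from data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def make_params(input_size, hidden_size, value):
    """Build constant-valued weights for the four gates (illustrative only)."""
    def block():
        W = [[value] * input_size for _ in range(hidden_size)]   # input weights
        U = [[value] * hidden_size for _ in range(hidden_size)]  # recurrent weights
        b = [value] * hidden_size                                # bias
        return W, U, b
    return {name: block() for name in ("i", "f", "o", "g")}


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps gate name -> (W, U, b)."""
    def gate(name, activation):
        W, U, b = params[name]
        pre = [wx + uh + bias
               for wx, uh, bias in zip(matvec(W, x), matvec(U, h_prev), b)]
        return [activation(p) for p in pre]

    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    o = gate("o", sigmoid)    # output gate
    g = gate("g", math.tanh)  # candidate cell state
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def run_lstm(sequence, hidden_size, params):
    """Feed a sequence of feature vectors through the cell; return the last hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h


if __name__ == "__main__":
    # Two hypothetical input factors per time step, three hidden units.
    params = make_params(input_size=2, hidden_size=3, value=0.1)
    print(run_lstm([[1.0, 2.0], [0.5, 0.1]], 3, params))
```

In this sketch each element of `sequence` is one time step of input factors, and the final hidden state would normally feed a small output layer that predicts the next water level.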
Proceedings of the AAAI Conference on Artificial Intelligence, Journal Year: 2024, Volume and Issue: 38(9), P. 10645 - 10653
Published: March 24, 2024
GraIL and its variants have shown their promising capacities for inductive relation reasoning on knowledge graphs. However, the uni-directional message-passing mechanism hinders such models from exploiting hidden mutual relations between entities in directed graphs. Besides, the enclosing subgraph extraction in most GraIL-based models restricts the model from extracting enough discriminative information for reasoning. Consequently, the expressive ability of these models is limited. To address the problems, we propose a novel GraIL-based framework, termed MINES, by introducing a Message Intercommunication mechanism on the Neighbor-Enhanced Subgraph. Concretely, the message intercommunication mechanism is designed to capture the omitted hidden mutual information. It introduces bi-directed information interactions between connected entities by inserting an undirected/bi-directed GCN layer between uni-directed RGCN layers. Moreover, inspired by the success of involving more neighbors in other graph-based tasks, we extend the neighborhood area beyond the enclosing subgraph to enhance the information collection for inductive relation reasoning. Extensive experiments prove the promising capacity of the proposed MINES from various aspects, especially for the superiority, effectiveness, and transfer ability.
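The message intercommunication idea can be illustrated with a deliberately simplified sketch. It assumes plain sum aggregation with a self-connection and ignores relation types, learned weights, and normalization, all of which real RGCN and GCN layers have. The function names are hypothetical and this is not the MINES implementation; it only shows why an undirected layer placed between directed layers lets a head entity receive information from its tail.

```python
def directed_layer(features, edges):
    """Uni-directed step: each node keeps its own features and adds those of
    its in-neighbors, so messages flow only from head to tail."""
    out = {node: list(vec) for node, vec in features.items()}  # self-connection
    for head, tail in edges:
        out[tail] = [a + b for a, b in zip(out[tail], features[head])]
    return out


def undirected_layer(features, edges):
    """Bi-directed step: treat every edge as if it also existed in reverse."""
    both_ways = set(edges) | {(tail, head) for head, tail in edges}
    return directed_layer(features, sorted(both_ways))


def intercommunication_stack(features, edges):
    """Directed layer, then an undirected layer, then a directed layer."""
    h = directed_layer(features, edges)
    h = undirected_layer(h, edges)
    return directed_layer(h, edges)


if __name__ == "__main__":
    # Toy directed chain a -> b -> c with one-hot node features.
    feats = {"a": [1, 0, 0], "b": [0, 1, 0], "c": [0, 0, 1]}
    edges = [("a", "b"), ("b", "c")]
    print(directed_layer(feats, edges)["a"])            # "a" has no in-edges
    print(intercommunication_stack(feats, edges)["a"])  # "a" now sees "b"
```

With only directed layers, node `a` never receives anything because it has no incoming edge; once the undirected layer is inserted, information from `b` reaches it.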
Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown
Published: March 11, 2024
Abstract
This study provides a comprehensive evaluation of the efficiency of Large Language Models (LLMs) in performing diverse language understanding and generation tasks. Through a systematic comparison of open-source models including GPT-Neo, Bloom, FLAN-T5, and Mistral-7B, the research explores their performance across widely recognized benchmarks such as GLUE, SuperGLUE, LAMBADA, and SQuAD. Our findings reveal significant variations in model accuracy, computational efficiency, scalability, and adaptability, underscoring the influence of model architecture and training paradigms on performance outcomes. The study identifies key factors contributing to the models' performance and offers insights into potential optimization strategies for enhancing their applicability in real-world NLP applications. By highlighting the strengths and limitations of current LLMs, this study contributes to the ongoing development of more effective, efficient, and adaptable language models, paving the way for future advancements in the field of natural language processing.
Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown
Published: May 21, 2024
Abstract
Large Language Models (LLMs) have emerged as powerful tools in the domain of software vulnerability and cybersecurity tasks, offering promising capabilities in detecting and handling security threats. This article explores the utilization of LLMs in various aspects of cybersecurity, including vulnerability detection, threat prediction, and automated code repair. We explain the concept of LLMs, highlighting their applications, and evaluate their effectiveness and challenges through a literature review. We explore the utilization of LLMs across different domains, showcasing their proficiency in tasks like malware detection and code summarization. Comparing LLMs to traditional methods, our work highlights their superior performance in identifying vulnerabilities and proposing fixes. Furthermore, we outline the workflow of LLM models, emphasizing their integration into cyber security frameworks and incident response systems. We also discuss complementary methods that enhance LLMs' capabilities, including static and dynamic code analyzers. Additionally, we synthesize findings from previous research, demonstrating how the utilization of LLMs has significantly enhanced productivity in addressing security threats. Finally, the study offers insights into optimizing the implementation of LLMs based on lessons learned from existing literature.
Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown
Published: June 6, 2024
Abstract
Recent advancements in natural language processing have highlighted the critical importance of efficiently updating pre-trained models with domain-specific knowledge. Traditional methods requiring comprehensive retraining are resource-intensive and impractical for many applications. The proposed techniques of knowledge injection, including the integration of adapter layers, retrieval-augmented generation (RAG), and knowledge distillation, offer a novel and significant solution to this challenge by enabling efficient updates without extensive retraining. Adapter layers allow for specialized fine-tuning, preserving the model's original capabilities while incorporating new information. RAG enhances the contextual relevance of generated responses by dynamically retrieving pertinent information from a knowledge base. Knowledge distillation transfers knowledge between smaller models and a larger model, augmenting its performance in specialized domains. Experimental results demonstrated substantial improvements in accuracy, precision, recall, and F1-score, along with enhanced contextual relevance and coherence. The findings demonstrate the potential of the proposed techniques to maintain accuracy in dynamic, information-rich environments, making them particularly useful for fields requiring timely and accurate information.
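The claim that adapter layers preserve the original model while adding new information can be made concrete with a small sketch. It assumes the common bottleneck-adapter design (down-projection, nonlinearity, up-projection, residual connection), which the abstract does not confirm; the function names, dimensions, and weights are hypothetical. With the up-projection initialised to zeros, the adapter is an identity function, so the frozen model's behaviour is unchanged until training moves those weights.

```python
def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def relu(v):
    return [max(0.0, x) for x in v]


def adapter(hidden, w_down, w_up):
    """Bottleneck adapter with a residual connection.

    hidden: activation from the frozen base model (length d)
    w_down: r x d matrix projecting down to a small bottleneck
    w_up:   d x r matrix projecting back up
    """
    bottleneck = relu(matvec(w_down, hidden))
    delta = matvec(w_up, bottleneck)
    return [h + d for h, d in zip(hidden, delta)]


if __name__ == "__main__":
    hidden = [0.5, -1.0, 2.0, 0.0]
    w_down = [[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]]
    zero_up = [[0.0, 0.0] for _ in range(4)]
    # Zero-initialised up-projection: output equals the frozen activation.
    print(adapter(hidden, w_down, zero_up))
    # After (hypothetical) training, only the small adapter weights differ.
    trained_up = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
    print(adapter(hidden, w_down, trained_up))
```

Only `w_down` and `w_up` would be updated during domain-specific fine-tuning in this setup, which is why the parameter cost is small compared with retraining the whole model.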
Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown
Published: Jan. 23, 2024
Abstract
This study presents a novel approach to enhancing information retrieval capabilities in Large Language Models (LLMs) by integrating deep learning with symbolic reasoning, specifically in the TinyLlama model. The research addresses the inherent limitations of LLMs in processing contextually complex queries and ensuring factual accuracy. By amalgamating the intuitive pattern recognition of deep learning with the structured, rule-based logic of symbolic reasoning, the improved model demonstrates a significant elevation in performance. The study employs BIG-bench benchmark tasks to empirically validate the model's enhancements in accuracy, logical consistency, and rule adherence. Additionally, the research emphasizes the importance of interpretability and trust, positioning the hybrid model as a more transparent and reliable AI tool. The findings not only showcase the efficacy of the hybrid architecture but also pave the way for future research, focusing on sophisticated cognitive functions and autonomous adaptation in dynamic environments. The work sets a precedent for the evolution of LLMs, moving towards systems capable of nuanced reasoning akin to human cognitive processes.
Proceedings of the AAAI Conference on Artificial Intelligence, Journal Year: 2024, Volume and Issue: 38(17), P. 19080 - 19088
Published: March 24, 2024
Large language models (LLMs) have shown remarkable generalization capability with exceptional performance in various language modeling tasks. However, they still exhibit inherent limitations in precisely capturing and returning grounded knowledge. While existing work has explored utilizing knowledge graphs (KGs) to enhance language modeling via joint training and customized model architectures, applying this to LLMs is problematic owing to their large number of parameters and high computational cost. Therefore, how to enhance pre-trained LLMs using grounded knowledge, e.g., retrieval-augmented generation, remains an open question. In this work, we propose Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-trained LLMs in learning beneficial knowledge from KGs. GNP encompasses various designs, including a standard graph neural network encoder, a cross-modality pooling module, a domain projector, and a self-supervised link prediction objective. Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks across different LLM sizes and settings. Code is available at https://github.com/meettyj/GNP.
This work presents significant advancements in the multimodal capabilities of the Mistral 8x7B model, a large language model designed with eight experts of seven billion parameters each. We introduce comprehensive modifications to its architecture, data fusion techniques, and training procedures, aimed at improving the integration and processing of text, image, and audio data. Our experimental results demonstrate that these enhancements lead to superior performance across multiple modalities when compared to existing benchmarks. The improved model showcases enhanced accuracy, F1 scores, and index, confirming its ability to offer more coherent and contextually appropriate outputs. This research not only sets new benchmarks for multimodal language models but also opens up further avenues for applying such models in real-world, diverse, and dynamic environments.
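The abstract above only says the model is built from eight experts; it does not describe how inputs are routed to them. Assuming the sparse top-k gating commonly used in mixture-of-experts layers, a minimal sketch looks like the following. The function names, the choice of k = 2, and the toy experts are hypothetical and are not taken from the paper.

```python
import math


def top_k_route(gate_logits, k=2):
    """Pick the k experts with the highest gate logits and softmax-normalise
    their weights over just those k."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(gate_logits[i] for i in chosen)  # subtract for numerical stability
    exps = [math.exp(gate_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_forward(x, gate_logits, experts, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    out = [0.0] * len(x)
    for index, weight in top_k_route(gate_logits, k):
        y = experts[index](x)
        out = [o + weight * v for o, v in zip(out, y)]
    return out


if __name__ == "__main__":
    # Eight toy experts; expert s simply scales its input by s.
    experts = [lambda x, s=s: [s * v for v in x] for s in range(8)]
    logits = [0.0] * 8
    logits[3] = 5.0
    logits[6] = 5.0
    print(top_k_route(logits))
    print(moe_forward([1.0, 2.0], logits, experts))
```

Because only the selected experts run for a given input, the compute per token stays well below what the total parameter count would suggest.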
This study conducts a comprehensive analysis of the interpretability and explainability of five leading Large Language Models (LLMs): TripoSR by Stability AI, Gemma-7b by Google, Mistral 7B by Mistral AI, Llama-2-7b by Meta, and GemMoE-Beta-1 by CrystalCare AI. Through a methodical evaluation encompassing both qualitative and quantitative benchmarks, we assess these models' capacity to make their decision-making processes understandable to humans. Our findings reveal significant variability in the models' ability to provide transparent reasoning and accurate, contextually relevant explanations across different contexts. Notably, some models demonstrated superior transparency, while others excelled in the accuracy of their explanations. However, challenges in maintaining consistent explanations across varying inputs and the need for enhanced adaptability to feedback highlight areas for future improvement. This research underscores the importance of fostering trust and reliability in LLM applications, advocating for continued advancement to achieve more transparent, accountable, and user-centric AI systems. Directions for future research include the development of standardized methodologies and interdisciplinary approaches to enhance model transparency and user understanding.
Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown
Published: April 29, 2024
Abstract
Cross-domain knowledge transfer in large language models (LLMs) presents significant challenges, particularly regarding the extensive resources required for retraining. This research introduces innovative embedding adaptation and context adjustment techniques that enable LLMs to efficiently transfer knowledge across diverse domains without the need for comprehensive retraining. Experimental results demonstrate improved model flexibility and reduced computational demands, highlighting the potential for rapid deployment and scalability. These findings suggest a sustainable approach to deploying adaptive AI in various sectors, significantly impacting future developments in artificial intelligence.