Large language models have emergent capabilities that appear unexpectedly at scale, but we need a theoretical framework to explain why and how they emerge. We prove that large language models are actually non-ergodic systems while providing a mathematical framework, based on Stuart Kauffman's theory of the adjacent possible (TAP), for capability emergence. Our resource-constrained TAP equation demonstrates how architectural, training, and contextual constraints interact to shape model capabilities through phase transitions in semantic space. Experiments with three different models show that capacities emerge through discrete transitions guided by constraint interactions and path-dependent exploration. This provides a basis for understanding emergence in language models and guides the development of architectures that can guide capability emergence.
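For readers unfamiliar with TAP dynamics, the standard (unconstrained) TAP growth equation from Kauffman and collaborators is reproduced below, together with one plausible way to impose a resource constraint; the capacity-factor form is an illustrative assumption, not necessarily this paper's exact formulation.

```latex
% Kauffman-style TAP growth: M_t existing elements (here, semantic
% capabilities) combine to open adjacent possibilities; \alpha_i is the
% success rate of combining i elements, typically decaying rapidly in i.
M_{t+1} = M_t + \sum_{i=1}^{M_t} \alpha_i \binom{M_t}{i}

% Hypothetical resource-constrained variant: a saturation factor slows
% growth as M_t approaches a capacity R set by architectural, training,
% and contextual constraints (assumed form for illustration only).
M_{t+1} = M_t + \left(1 - \frac{M_t}{R}\right) \sum_{i=1}^{M_t} \alpha_i \binom{M_t}{i}
```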
IEEE Access, 2024, Vol. 12, pp. 101603-101625. Published: Jan. 1, 2024.
Autonomous driving has achieved significant milestones in research and development over the last two decades. There is increasing interest in the field as the deployment of autonomous vehicles (AVs) promises safer and more ecologically friendly transportation systems. With the rapid progress of computationally powerful artificial intelligence (AI) techniques, AVs can sense their environment with high precision, make safe real-time decisions, and operate reliably without human intervention. However, intelligent decision-making in such vehicles is not generally understandable by humans in the current state of the art, and this deficiency hinders the technology from being socially acceptable. Hence, aside from making safe real-time decisions, AVs must also explain their AI-guided decision-making process in order to be regulatory-compliant across many jurisdictions. Our study sheds comprehensive light on explainable artificial intelligence (XAI) approaches for AVs. In particular, we make the following contributions. First, we provide a thorough overview of state-of-the-art and emerging approaches for XAI-based autonomous driving. We then propose a conceptual framework considering the essential elements of explainable end-to-end autonomous driving. Finally, we present prospective research directions and emerging paradigms for the future that hold promise for enhancing the transparency, trustworthiness, and societal acceptance of autonomous driving.
IEEE Access, 2024, Vol. 12, pp. 41180-41218. Published: Jan. 1, 2024.
In today's digital age, Convolutional Neural Networks (CNNs), a subset of Deep Learning (DL), are widely used for various computer vision tasks such as image classification, object detection, and segmentation. There are numerous types of CNNs designed to meet specific needs and requirements, including 1D, 2D, and 3D CNNs, as well as dilated, grouped, attention, and depthwise convolutions and NAS-derived architectures, among others. Each type of CNN has its unique structure and characteristics, making it suitable for specific tasks. It is crucial to gain a thorough understanding of, and perform a comparative analysis of, these different CNN types in order to understand their strengths and weaknesses. Furthermore, studying the performance, limitations, and practical applications of each type can aid in the development of new and improved architectures in the future. We also dive into the platforms and frameworks that researchers utilize for their research or development, from several perspectives. Additionally, we explore the main research fields of CNNs, like 6D vision, generative models, and meta-learning. This survey paper provides a comprehensive examination and comparison of various CNN architectures, highlighting their architectural differences and emphasizing their respective advantages, disadvantages, applications, challenges, and future trends.
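To make the distinctions above concrete, the following sketch instantiates the named convolution variants in PyTorch; the tensor shapes and channel counts are arbitrary illustrative choices, not values from the survey.

```python
# Minimal sketch of the convolution variants discussed above (PyTorch).
import torch
import torch.nn as nn

x1d = torch.randn(8, 16, 100)        # (batch, channels, length)   e.g. audio
x2d = torch.randn(8, 16, 64, 64)     # (batch, channels, H, W)     e.g. images
x3d = torch.randn(8, 16, 8, 64, 64)  # (batch, channels, D, H, W)  e.g. video

conv1d = nn.Conv1d(16, 32, kernel_size=3, padding=1)
conv2d = nn.Conv2d(16, 32, kernel_size=3, padding=1)
conv3d = nn.Conv3d(16, 32, kernel_size=3, padding=1)

# Dilated: kernel taps spaced apart, enlarging the receptive field
# without adding parameters.
dilated = nn.Conv2d(16, 32, kernel_size=3, padding=2, dilation=2)

# Grouped: channels split into independent groups (here 4), cutting
# parameters and compute by the group factor.
grouped = nn.Conv2d(16, 32, kernel_size=3, padding=1, groups=4)

# Depthwise: one filter per input channel (groups == in_channels),
# usually followed by a 1x1 "pointwise" conv, as in MobileNets.
depthwise = nn.Conv2d(16, 16, kernel_size=3, padding=1, groups=16)
pointwise = nn.Conv2d(16, 32, kernel_size=1)

for y in (conv1d(x1d), conv2d(x2d), conv3d(x3d),
          dilated(x2d), grouped(x2d), pointwise(depthwise(x2d))):
    print(tuple(y.shape))
```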
Transactions of the Association for Computational Linguistics, 2024, Vol. 12, pp. 484-506. Published: Jan. 1, 2024.
While large language models (LLMs) have shown remarkable effectiveness in various NLP tasks, they are still prone to issues such as hallucination, unfaithful reasoning, and toxicity. A promising approach to rectify these flaws is correcting LLMs with feedback, where the LLM itself is prompted or guided with feedback to fix problems in its own output. Techniques leveraging automated feedback, whether produced by the LLM itself (self-correction) or by some external system, are of particular interest as they make LLM-based solutions more practical and deployable with minimal human intervention. This paper provides an exhaustive review of recent advances in correcting LLMs with automated feedback, categorizing them into training-time, generation-time, and post-hoc approaches. We also identify potential challenges and future directions in this emerging field.
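As a concrete illustration of the generation-time category, here is a minimal critique-and-revise loop; the `llm` callable is a hypothetical placeholder for any text-generation call, not an API from the paper.

```python
# Sketch of a generation-time self-correction loop: draft, critique, revise.
from typing import Callable

def self_correct(task: str, llm: Callable[[str], str], max_rounds: int = 3) -> str:
    draft = llm(f"Task: {task}\nAnswer:")
    for _ in range(max_rounds):
        feedback = llm(
            f"Task: {task}\nAnswer: {draft}\n"
            "Critique this answer. Reply DONE if it has no problems."
        )
        if "DONE" in feedback:      # critic found nothing left to fix
            return draft
        draft = llm(                # revise the draft using the critique
            f"Task: {task}\nAnswer: {draft}\nCritique: {feedback}\n"
            "Rewrite the answer to fix the critique:"
        )
    return draft

# Usage with any model call; a trivial stub is used here for demonstration.
print(self_correct("Add 2 and 2", llm=lambda p: "DONE" if "Critique" in p else "4"))
```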
ACM Computing Surveys, 2025. Published: Feb. 13, 2025.
As the applications of large language models (LLMs) expand across diverse fields, their ability to adapt to ongoing changes in data, tasks, and user preferences becomes crucial. Traditional training methods with static datasets are inadequate for coping with the dynamic nature of real-world information. Lifelong learning, or continual learning, addresses this by enabling LLMs to learn continuously over their operational lifetime, integrating new knowledge while retaining previously learned information and preventing catastrophic forgetting. Our survey explores the landscape of lifelong learning, categorizing strategies into two groups based on how new knowledge is integrated: Internal Knowledge, where LLMs absorb new knowledge into their parameters through full or partial training, and External Knowledge, which incorporates new knowledge as external resources like Wikipedia or APIs without updating model parameters. The key contributions of our survey include: (1) introducing a novel taxonomy to categorize the extensive literature of lifelong learning into 12 scenarios; (2) identifying common techniques across all lifelong learning scenarios and classifying existing literature into various technique groups; (3) highlighting emerging techniques, such as model expansion and data selection, which were less explored in the pre-LLM era. Resources are available at https://github.com/qianlima-lab/awesome-lifelong-learning-methods-for-llm.
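As a toy illustration of the External Knowledge strategy, the sketch below keeps new facts in an outside store and retrieves them at query time, leaving model parameters untouched; the bag-of-words retriever and every name here are illustrative assumptions, not the survey's method.

```python
# External-knowledge sketch: the store grows over time, the model does not.
from collections import Counter

knowledge_store: list[str] = []           # grows over the model's lifetime

def add_knowledge(fact: str) -> None:
    knowledge_store.append(fact)          # no gradient step, so no forgetting

def retrieve(query: str, k: int = 3) -> list[str]:
    q = Counter(query.lower().split())
    def overlap(doc: str) -> int:         # toy lexical-overlap score
        return sum((q & Counter(doc.lower().split())).values())
    return sorted(knowledge_store, key=overlap, reverse=True)[:k]

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    # A real system would pass this prompt to a frozen LLM.
    return f"Context:\n{context}\nQuestion: {query}"

add_knowledge("SLCA reduces the representation learning rate in continual learning.")
print(answer("How does SLCA treat the representation layer?"))
```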
IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 19091-19101. Published: Oct. 1, 2023.
The goal of continual learning is to improve the performance of recognition models in learning sequentially arrived data. Although most existing works are established on the premise of learning from scratch, growing efforts have been devoted to incorporating the benefits of pre-training. However, how to adaptively exploit the pre-trained knowledge for each incremental task while maintaining its generalizability remains an open question. In this work, we present an extensive analysis of continual learning on a pre-trained model (CLPM), and attribute the key challenge to a progressive overfitting problem. Observing that selectively reducing the learning rate can almost resolve this issue in the representation layer, we propose a simple but extremely effective approach named Slow Learner with Classifier Alignment (SLCA), which further improves the classification layer by modeling the class-wise distributions and aligning the classification layers in a post-hoc fashion. Across a variety of scenarios, our proposal provides substantial improvements for CLPM (e.g., up to 49.76%, 50.05%, 44.69% and 40.16% on Split CIFAR-100, ImageNet-R, CUB-200 and Cars-196, respectively), and thus outperforms state-of-the-art approaches by a large margin. Based on such a strong baseline, critical factors and promising directions are analyzed in depth to facilitate subsequent research. Code has been made available at: https://github.com/GengDavid/SLCA.
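The following PyTorch sketch illustrates the two ingredients (a slow representation learning rate, and post-hoc classifier alignment on class-wise Gaussian statistics). It is a simplified rendering of the idea with assumed layer sizes and learning-rate ratio; the authors' reference implementation is at the repository above.

```python
# SLCA-style sketch: slow backbone, fast classifier, post-hoc alignment.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU())
classifier = nn.Linear(256, 100)

# (1) Slow Learner: a much smaller learning rate for the representation
# than for the classifier mitigates progressive overfitting.
optimizer = torch.optim.SGD([
    {"params": backbone.parameters(), "lr": 1e-4},    # slow representation
    {"params": classifier.parameters(), "lr": 1e-2},  # fast classifier
], momentum=0.9)

# (2) Classifier Alignment: keep per-class feature statistics, then
# re-train the classifier alone on samples drawn from those class-wise
# Gaussian distributions.
class_stats: dict[int, tuple[torch.Tensor, torch.Tensor]] = {}

@torch.no_grad()
def record_stats(features: torch.Tensor, label: int) -> None:
    # features: (n, 256) embeddings of one class from the current task.
    class_stats[label] = (features.mean(0), features.var(0) + 1e-4)

def align_classifier(steps: int = 100, n: int = 32) -> None:
    opt = torch.optim.SGD(classifier.parameters(), lr=1e-2)
    for _ in range(steps):
        xs, ys = [], []
        for label, (mu, var) in class_stats.items():
            xs.append(mu + var.sqrt() * torch.randn(n, mu.numel()))
            ys.append(torch.full((n,), label))
        loss = nn.functional.cross_entropy(
            classifier(torch.cat(xs)), torch.cat(ys))
        opt.zero_grad()
        loss.backward()
        opt.step()
```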
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, Vol. 47(3), pp. 1464-1483. Published: Nov. 14, 2024.
Forgetting refers to the loss or deterioration of previously acquired knowledge. While existing surveys on forgetting have primarily focused on continual learning, forgetting is a prevalent phenomenon observed in various other research domains within deep learning. It manifests in fields such as generative models, due to generator shifts, and federated learning, due to heterogeneous data distributions across clients. Addressing forgetting encompasses several challenges, including balancing the retention of old task knowledge with fast learning of the new task, managing task interference with conflicting goals, preventing privacy leakage, etc. Moreover, most existing works implicitly assume that forgetting is always harmful. In contrast, our survey argues that forgetting is a double-edged sword and can be beneficial and desirable in certain cases, such as privacy-preserving scenarios. By exploring forgetting in a broader context, we present a more nuanced understanding of this phenomenon and highlight its potential advantages. Through this comprehensive survey, we aspire to uncover potential solutions by drawing upon ideas and approaches from fields that have dealt with forgetting. By examining forgetting beyond its conventional boundaries, we hope to encourage the development of novel strategies for mitigating, harnessing, or even embracing forgetting in real applications.
Nature Communications, 2025, Vol. 16(1). Published: Feb. 2, 2025.
Current artificial systems suffer from catastrophic forgetting during continual learning, a limitation absent in biological systems. Biological mechanisms leverage the dual representation of specific and generalized memories within corticohippocampal circuits to facilitate lifelong learning. Inspired by this, we develop a corticohippocampal-circuits-based hybrid neural network (CH-HNN) that emulates these dual representations, significantly mitigating catastrophic forgetting in both task-incremental and class-incremental learning scenarios. Our CH-HNNs incorporate artificial neural networks and spiking neural networks, leveraging prior knowledge to facilitate new concept learning through episode inference, and offer insights into the functions of the feedforward and feedback loops within corticohippocampal circuits. Crucially, CH-HNN operates as a task-agnostic system without increasing memory demands, demonstrating adaptability and robustness in real-world applications. Coupled with the low power consumption inherent to SNNs, our model represents the potential for energy-efficient continual learning in dynamic environments.
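The CH-HNN architecture itself is specified in the paper; purely to illustrate what an ANN-SNN hybrid looks like, here is a toy model in which a conventional ANN encoder drives a leaky integrate-and-fire (LIF) spiking layer. Every design choice below is an assumption for the example, not the authors' design.

```python
# Toy ANN-SNN hybrid: an ANN front end drives LIF spiking neurons that
# accumulate evidence over T timesteps (inference-style, no training).
import torch
import torch.nn as nn

class LIF(nn.Module):
    """Leaky integrate-and-fire neurons with hard reset."""
    def __init__(self, dim: int, tau: float = 2.0, v_th: float = 1.0):
        super().__init__()
        self.dim, self.tau, self.v_th = dim, tau, v_th

    def forward(self, currents: torch.Tensor) -> torch.Tensor:
        # currents: (T, batch, dim) input currents; returns spike counts.
        v = torch.zeros(currents.shape[1], self.dim)
        counts = torch.zeros_like(v)
        for i_t in currents:
            v = v + (i_t - v) / self.tau        # leaky integration
            spikes = (v >= self.v_th).float()   # threshold crossing fires
            v = v * (1.0 - spikes)              # hard reset after a spike
            counts += spikes
        return counts

ann = nn.Sequential(nn.Linear(784, 128), nn.ReLU())  # conventional ANN encoder
proj = nn.Linear(128, 10)                            # drives the spiking layer
lif = LIF(10)

x = torch.randn(32, 784)
current = proj(ann(x)).expand(8, -1, -1)  # repeat the input over T=8 steps
scores = lif(current)                     # spike counts act as class scores
print(scores.shape)                       # torch.Size([32, 10])
```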
Big Data and Cognitive Computing, 2025, Vol. 9(3), Article 50. Published: Feb. 20, 2025.
Large language models (LLMs) have demonstrated remarkable capabilities in text generation, which also raises numerous concerns about their potential misuse, especially in educational exercises and academic writing. Accurately identifying and tracing the origins of LLM-generated content is crucial for accountability and transparency, ensuring the responsible use of LLMs in educational environments. Previous methods utilize binary classifiers to discriminate whether a piece of text was written by a human or generated by a specific LLM, or employ multi-class classifiers to trace the source LLM from a fixed set. These methods, however, are restricted to one or several pre-specified LLMs and cannot generalize to new LLMs, which are continually emerging. This study formulates origin tracing in a class-incremental learning (CIL) fashion, where new LLMs continually emerge and a model incrementally learns to identify them without forgetting old ones. A training-free continual learning method is further devised for this task, whose key idea is to extract prototypes for emerging LLMs using a frozen encoder, and then perform origin tracing via prototype matching after a delicate decorrelation process. For evaluation, two datasets are constructed, one in English and one in Chinese. They simulate a scenario in which six LLMs emerge over time and are used to generate student essays, so an origin detector has to expand its recognition scope as new LLMs appear. Experimental results show that the proposed method achieves an average accuracy of 97.04% on the English dataset and 91.23% on the Chinese dataset. The results validate the feasibility of the task and verify the effectiveness of the method in detecting LLM-assisted cheating in coursework.
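As a sketch of the training-free recipe (frozen-encoder prototypes, decorrelation, nearest-prototype matching), the following NumPy class shows one plausible realization; the paper's exact decorrelation step may differ.

```python
# Training-free origin tracing via decorrelated prototype matching (sketch).
import numpy as np

class PrototypeTracer:
    def __init__(self, dim: int):
        self.dim = dim
        self.prototypes: dict[str, np.ndarray] = {}
        self.feats: list[np.ndarray] = []

    def add_llm(self, name: str, features: np.ndarray) -> None:
        # features: (n, dim) frozen-encoder embeddings of texts from `name`.
        # Adding a new LLM never retrains anything, so nothing is forgotten.
        self.prototypes[name] = features.mean(axis=0)
        self.feats.append(features)

    def _whitener(self) -> np.ndarray:
        # Decorrelate feature dimensions: inverse square root of the pooled
        # covariance, shrunk toward identity for numerical stability.
        cov = np.cov(np.vstack(self.feats), rowvar=False)
        cov = 0.9 * cov + 0.1 * np.eye(self.dim)
        vals, vecs = np.linalg.eigh(cov)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    def trace(self, feature: np.ndarray) -> str:
        # Nearest prototype in the decorrelated space wins.
        W = self._whitener()
        scores = {name: -np.linalg.norm(W @ (feature - p))
                  for name, p in self.prototypes.items()}
        return max(scores, key=scores.get)
```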