Drones, Journal Year: 2024, Volume and Issue: 9(1), P. 10 - 10, Published: Dec. 25, 2024
Deep reinforcement learning (DRL) has significantly advanced online path planning for unmanned aerial vehicles (UAVs). Nonetheless, DRL-based methods often encounter reduced performance when dealing with unfamiliar scenarios. This decline is mainly caused by the presence of non-causal and domain-specific elements within visual representations, which negatively impact policies. Present techniques generally depend on predefined augmentation or regularization intended to direct the model toward identifying causal, domain-invariant components, thereby enhancing the model's ability to generalize. However, these manually crafted approaches are intrinsically constrained in their coverage and do not consider the entire spectrum of possible scenarios, resulting in less effective behaviour in new environments. Unlike prior studies, this work prioritizes representation learning and presents a novel method for disentanglement. The approach successfully distinguishes between causal, domain-invariant factors and non-causal, domain-specific factors within visual data. By concentrating on the causal aspects during the policy learning phase, the influence of domain-specific factors is minimized, improving the generalizability of DRL models. Experimental results demonstrate that our technique achieves reliable navigation and collision avoidance in unseen environments, surpassing state-of-the-art deep reinforcement learning algorithms.
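The abstract does not specify the network architecture, so the following is only a minimal PyTorch sketch of the general idea: an encoder splits the visual embedding into a causal part consumed by the policy and a domain-specific part the policy never sees. All layer sizes, the 64/64 latent split, and the toy policy head are assumptions for illustration.

```python
import torch
import torch.nn as nn

LATENT, CAUSAL, ACT_DIM = 128, 64, 4   # assumed dimensions, not from the paper

class DisentangledEncoder(nn.Module):
    """Maps an image to a latent vector and splits it into a causal part
    (fed to the policy) and a domain-specific part (ignored by the policy)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, LATENT))

    def forward(self, img):                      # img: (B, 3, H, W)
        z = self.backbone(img)
        return z[:, :CAUSAL], z[:, CAUSAL:]      # causal / domain-specific split

policy = nn.Sequential(nn.Linear(CAUSAL, 128), nn.ReLU(), nn.Linear(128, ACT_DIM))

encoder = DisentangledEncoder()
z_causal, z_domain = encoder(torch.randn(2, 3, 96, 96))
action_logits = policy(z_causal)                 # the policy never sees z_domain
```

In this kind of setup the disentanglement objective (not shown, since the paper's loss is not given here) would push domain-dependent nuisance information into z_domain so the policy input stays stable across scenes.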
Aerospace, Journal Year: 2024, Volume and Issue: 11(11), P. 870 - 870, Published: Oct. 24, 2024
This review explores the integration of machine learning (ML) and reinforcement learning (RL) techniques in enhancing the navigation and obstacle avoidance capabilities of Unmanned Aerial Vehicles (UAVs). Various RL algorithms are assessed for their effectiveness in teaching UAVs autonomous navigation, with a focus on state representation from UAV sensors and real-time environmental interaction. The review identifies the strengths and limitations of current methodologies, highlights gaps in the literature, and proposes future research directions to advance the technology. Interdisciplinary approaches combining robotics, AI, and aeronautics are suggested to improve UAV performance in complex environments.
E-Learning and Digital Media, Journal Year: 2024, Volume and Issue: unknown, Published: Oct. 17, 2024
Road hazards significantly contribute to fatalities in traffic accidents. As the number of vehicles on the road increases, the risk of accidents rises, especially under adverse weather conditions that impair visibility and road conditions. In such scenarios, it is crucial to alert approaching vehicles to prevent further collisions, and detecting humans or animals on the road is essential to minimize accidents. Accurate detection and estimation are vital for ensuring safety and enhancing the driving experience. Current deep learning methods for road condition monitoring are often time-consuming, costly, inefficient, and labor-intensive, and they require frequent updates. Therefore, there is a pressing need for a more flexible, cost-effective, and efficient process to detect road conditions, particularly hazards. In this work, we present a hazard avoidance system for autonomous vehicles using deep reinforcement learning (DRL) to address congestion issues in complex environments. We utilize GoogLeNet for feature extraction, which extracts features from the given images. Subsequently, we design a modified compact snake optimization (MCSO) algorithm for feature optimization, addressing data dimensionality issues. Additionally, we introduce geometric deep reinforcement learning (GDRL) for tracking in dynamic environments, improving the accuracy and robustness of visual detection. The proposed MCSO + GDRL model is validated on a self-made open-access dataset with 5607 samples from car recorders and the KITTI dataset for training.
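As a rough illustration of the GoogLeNet feature-extraction step, the sketch below uses a torchvision GoogLeNet backbone with its classifier removed to obtain a 1024-dimensional feature vector per frame. The ImageNet-pretrained weights and the preset preprocessing are assumptions, since the abstract does not state them, and the MCSO and GDRL stages are not modeled here.

```python
import torch
import torchvision.models as models

# Assumption: a pretrained GoogLeNet backbone; the paper's training details are not given.
weights = models.GoogLeNet_Weights.IMAGENET1K_V1
backbone = models.googlenet(weights=weights)
backbone.fc = torch.nn.Identity()      # drop the classifier, keep the 1024-d features
backbone.eval()

preprocess = weights.transforms()      # resize / crop / normalize as the weights expect

@torch.no_grad()
def extract_features(frame):
    """Return a (1, 1024) feature vector for one image frame."""
    x = preprocess(frame).unsqueeze(0)
    return backbone(x)

# Stand-in for a car-recorder frame (random pixels, just to show the shapes).
frame = torch.randint(0, 256, (3, 720, 1280), dtype=torch.uint8)
feats = extract_features(frame)        # shape: (1, 1024)
```

These feature vectors would then be the input that the MCSO stage prunes for dimensionality before the DRL tracker consumes them, per the pipeline described above.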
Drones, Journal Year: 2024, Volume and Issue: 8(11), P. 655 - 655, Published: Nov. 8, 2024
The game of pursuit–evasion has always been a popular research subject in the field of Unmanned Aerial Vehicles (UAVs). Current evasion decision making based on reinforcement learning is generally trained only for specific pursuers; it shows limited performance when evading unknown pursuers and exhibits poor generalizability. To enhance the ability of an evasion policy learned by reinforcement learning (RL) to evade unknown pursuers, this paper proposes a pursuit-UAV attitude estimation and strategy identification method together with a Model Reference Policy Adaptation (MRPA) algorithm. Firstly, it constructs a Markov model for the evading UAV that includes the pursuer's attitude and trains the evasion policy using the Soft Actor–Critic (SAC) algorithm. Secondly, it establishes a novel relative motion model for pursuit–evasion games under the assumption that proportional guidance is used as the pursuit strategy, based on which the proposed identification algorithm can provide adequate information for policy adaptation. Furthermore, the MRPA algorithm is presented to improve the generalizability of the RL policy in certain environments. Finally, various numerical simulations imply the precision of the attitude estimation and the accuracy of the strategy identification. Also, an ablation experiment verifies that MRPA can effectively deal with unknown pursuers.
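The relative motion model rests on the assumption that the pursuer flies proportional navigation, i.e. a commanded lateral acceleration a = N·Vc·λ̇ driven by the line-of-sight rate λ̇ and the closing speed Vc. The snippet below is a minimal 2-D sketch of that standard guidance law, not the paper's own model; the navigation gain and the example geometry are chosen arbitrarily.

```python
import numpy as np

def pn_acceleration(r_rel, v_rel, nav_gain=3.0):
    """2-D proportional navigation: a = N * Vc * lambda_dot.
    r_rel, v_rel: evader position/velocity relative to the pursuer."""
    r = np.linalg.norm(r_rel)
    los_rate = (r_rel[0] * v_rel[1] - r_rel[1] * v_rel[0]) / (r ** 2)  # lambda_dot
    closing_speed = -np.dot(r_rel, v_rel) / r                          # Vc
    return nav_gain * closing_speed * los_rate                         # lateral accel command

# Evader 1 km ahead of the pursuer, closing at 50 m/s with a small cross-track drift.
a_cmd = pn_acceleration(r_rel=np.array([1000.0, 0.0]), v_rel=np.array([-50.0, 2.0]))
# a_cmd = 3 * 50 m/s * 0.002 rad/s = 0.3 m/s^2
```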
IET Radar Sonar & Navigation, Journal Year: 2024, Volume and Issue: unknown, Published: Dec. 6, 2024
Model-free deep reinforcement learning (DRL) is regarded as an effective approach for multi-target cognitive electronic reconnaissance (MCER) missions. However, DRL networks with poor generalisation can significantly reduce mission completion rates when parameters such as area size, target number, and platform speed vary slightly. To address this issue, this paper introduces a novel scene reconstruction method for MCER missions together with a group adaptive transfer deep reinforcement learning (MTDRL) algorithm. The algorithm enables quick adaptation of strategies to varied scenes by transferring strategy templates and compressing perception states. To validate the method, the authors developed a model of unmanned aerial vehicle (UAV) MCER. Three sets of experiments are conducted with varying area size, target number, and platform speed. The results show that MTDRL outperforms two commonly used algorithms, with an 18% increase in mission completion rate and a 5.49 h reduction in training time. Furthermore, its completion rate is much higher than that of a typical non-DRL method. The UAV demonstrates stable hovering and repeat behaviours at the radar detection boundary, ensuring flight safety during the mission.
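The abstract does not describe how the perception states are compressed; one common way to keep the state size fixed when the target number varies between scenes is to retain only the k nearest targets. The sketch below is purely a hypothetical illustration of that idea: the function name, the 2-D geometry, and k = 5 are all assumptions.

```python
import numpy as np

def compress_perception(targets, uav_pos, k=5):
    """Hypothetical perception-state compression: keep the k nearest targets
    (relative position plus a 'valid' flag), so the state length is fixed
    even when the number of detected targets changes between scenes."""
    rel = np.asarray(targets, dtype=np.float32).reshape(-1, 2) - np.asarray(uav_pos, dtype=np.float32)
    order = np.argsort(np.linalg.norm(rel, axis=1))
    state = np.zeros((k, 3), dtype=np.float32)          # [dx, dy, valid] per slot
    for slot, idx in enumerate(order[:k]):
        state[slot, :2] = rel[idx]
        state[slot, 2] = 1.0
    return state.ravel()                                # fixed-length vector

# e.g. 8 detected emitters, UAV at the origin -> always a 15-dimensional state
obs = compress_perception(np.random.rand(8, 2) * 1000, uav_pos=[0.0, 0.0])
```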
Drones, Journal Year: 2024, Volume and Issue: 8(12), P. 782 - 782, Published: Dec. 22, 2024
The capability of UAVs for efficient autonomous navigation and obstacle avoidance in complex unknown environments is critical for applications such as agricultural irrigation, disaster relief, and logistics. In this paper, we propose the DPRL (Distributed Privileged Reinforcement Learning) algorithm, an end-to-end policy designed to address the challenge of high-speed UAV navigation under partially observable environmental conditions. Our approach combines deep reinforcement learning with privileged learning to overcome the impact of observation data corruption caused by partial observability. We leverage an asymmetric Actor–Critic architecture to provide the agent with privileged information during training, which enhances the model's perceptual capabilities. Additionally, we present a multi-agent exploration strategy that gathers experience across diverse environments to accelerate experience collection, which in turn expedites model convergence. We conducted extensive simulations across various scenarios, benchmarking our algorithm against state-of-the-art reinforcement learning algorithms. The results consistently demonstrate superior performance in terms of flight efficiency, robustness, and overall success rate.
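A minimal sketch of an asymmetric Actor–Critic in the spirit described above: the actor sees only the partial observation, while the critic additionally receives privileged state during training. The observation, privileged-state, and action dimensions below are placeholders, not values from the paper.

```python
import torch
import torch.nn as nn

OBS_DIM, PRIV_DIM, ACT_DIM = 64, 128, 4   # assumed sizes, for illustration only

class Actor(nn.Module):
    """Policy network: consumes only the (possibly corrupted) partial observation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, ACT_DIM), nn.Tanh())
    def forward(self, obs):
        return self.net(obs)

class Critic(nn.Module):
    """Value network: additionally receives privileged state (e.g. clean,
    fully observable environment information) available only in simulation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM + PRIV_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, 1))
    def forward(self, obs, priv):
        return self.net(torch.cat([obs, priv], dim=-1))

actor, critic = Actor(), Critic()
obs, priv = torch.randn(8, OBS_DIM), torch.randn(8, PRIV_DIM)
action = actor(obs)          # deployable on the UAV without privileged data
value = critic(obs, priv)    # used only during training
```

Because only the actor is kept at deployment time, no privileged information is needed on board; the privileged channel exists solely to give the critic a cleaner training signal.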