Sensors, Journal Year: 2024, Volume and Issue: 24(20), P. 6723-6723, Published: Oct. 19, 2024
Autonomous navigation systems often struggle in dynamic, complex environments due to challenges in safety, intent prediction, and strategic planning. Traditional methods are limited by rigid architectures and inadequate safety mechanisms, reducing adaptability in unpredictable scenarios. We propose SafeMod, a novel framework for enhancing safety in autonomous driving by improving decision-making and scenario management. SafeMod features a bidirectional planning structure with two components: forward planning and backward planning. Forward planning predicts surrounding agents' behavior using text-based environment descriptions and reasoning via large language models, generating action predictions. These are embedded into a transformer-based planner that integrates text and image data to produce feasible trajectories. Backward planning refines these trajectories using policy and value functions learned through Actor-Critic-based reinforcement learning, selecting optimal actions based on probability distributions. Experiments on the CARLA and nuScenes benchmarks demonstrate that SafeMod outperforms recent approaches in both real-world and simulation testing, significantly improving safety and decision-making. This underscores SafeMod's potential to effectively integrate safety considerations into autonomous driving.
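The backward-planning step described above can be illustrated with a minimal Actor-Critic sketch: the policy (actor) produces a probability distribution over candidate actions and the value function (critic) scores the state, with the selected action taken from the distribution. The linear networks, dimensions, and discrete action space below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax: shift by the max before exponentiating.
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

STATE_DIM, N_ACTIONS = 8, 4
W_pi = rng.normal(size=(N_ACTIONS, STATE_DIM)) * 0.1  # actor (policy) weights
w_v = rng.normal(size=STATE_DIM) * 0.1                # critic (value) weights

def actor_critic(state):
    # Actor: probability distribution over actions for this state.
    probs = softmax(W_pi @ state)
    # Critic: scalar state-value estimate V(s).
    value = float(w_v @ state)
    return probs, value

state = rng.normal(size=STATE_DIM)
probs, value = actor_critic(state)
action = int(np.argmax(probs))  # pick the most probable action
```

In training, the critic's value estimate would supply the advantage signal used to update the actor's weights; here only the action-selection path is shown.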
2021 IEEE/CVF International Conference on Computer Vision (ICCV), Journal Year: 2023, Volume and Issue: unknown, P. 11929-11940, Published: Oct. 1, 2023
In this paper, we propose a novel task, IntentQA, a special VideoQA task focusing on video intent reasoning, which has become increasingly important for AI with its advantages in equipping agents with the capability of reasoning beyond mere recognition in daily tasks. We also contribute a large-scale dataset for this task. We propose a Context-aware Video Intent Reasoning model (CaVIR), consisting of i) a Video Query Language (VQL) for better cross-modal representation of the situational context, ii) a Contrastive Learning module for utilizing the contrastive context, and iii) a Commonsense module for incorporating the commonsense context. Comprehensive experiments on this challenging task demonstrate the effectiveness of each component, the superiority of our full model over other baselines, and its generalizability to new tasks. The codes are open-sourced at: https://github.com/JoseponLee/IntentQA.git.
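The contrastive objective mentioned above can be sketched with a generic InfoNCE-style loss: an anchor embedding is pulled toward its matching (positive) embedding and pushed away from mismatched (negative) ones. This is a minimal illustration of the general technique, assuming cosine similarity, a temperature of 0.1, and 16-dimensional embeddings; none of these are details taken from the paper's Contrastive Learning module.

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss for one anchor, one positive, and a list of negatives."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

    # Similarities of the anchor to the positive (index 0) and negatives.
    sims = np.array([cos(anchor, positive)] +
                    [cos(anchor, n) for n in negatives]) / tau
    sims -= sims.max()  # numerical stability before exponentiating
    # Negative log-probability that the positive is picked over the negatives.
    return -np.log(np.exp(sims[0]) / np.exp(sims).sum())

rng = np.random.default_rng(0)
q = rng.normal(size=16)  # anchor embedding (e.g., a video-question pair)
loss_match = info_nce(q, q, [rng.normal(size=16) for _ in range(7)])
loss_mismatch = info_nce(q, rng.normal(size=16),
                         [rng.normal(size=16) for _ in range(7)])
```

A well-matched positive yields a small loss, while a random "positive" yields a larger one, which is the gradient signal a contrastive module exploits.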
Artificial intelligence (AI) has revolutionized human cognitive abilities and facilitated the development of new AI entities capable of interacting with humans in both physical and virtual environments. Despite the existence of virtual reality, mixed reality, and augmented reality for many years, integrating these technical fields remains a formidable challenge due to their disparate application directions. The advent of AI agents, capable of autonomous perception and action, further compounds this issue by exposing the limitations of traditional human-centered research approaches. It is imperative to establish a comprehensive framework that accommodates the dual perceptual centers of humans and AI agents in both physical and virtual worlds. In this paper, we introduce the symmetrical reality framework, which offers a unified representation encompassing various forms of physical-virtual amalgamations. This framework enables researchers to better comprehend how AI agents can collaborate with humans and how the distinct pathways of physical-virtual integration can be consolidated from a broader perspective. We then delve into the coexistence of humans and AI, demonstrating a prototype system that exemplifies the operation of such systems on specific tasks, such as pouring water. Finally, we propose an instance of an AI-driven active assistance service that illustrates the potential applications of symmetrical reality. This paper aims to offer beneficial perspectives and guidance for researchers and practitioners in different fields, thus contributing to the ongoing discussion about human-AI coexistence.