2023 IEEE/CVF International Conference on Computer Vision (ICCV),
Journal Year:
2023,
Volume and Issue:
unknown, P. 22442 - 22451
Published: Oct. 1, 2023
Keypoint detection & descriptors are foundational technologies for computer vision tasks like image matching, 3D reconstruction and visual odometry. Hand-engineered methods like Harris corners, SIFT, and HOG descriptors have been used for decades; more recently, there has been a trend to introduce learning in an attempt to improve keypoint detectors. On inspection, however, the results are difficult to interpret; recent learning-based methods employ a vast diversity of experimental setups and design choices: empirical results are often reported using different backbones, protocols, datasets, types of supervision or tasks. Since these differences are often coupled together, this raises the natural question of what makes a good learned keypoint detector. In this work, we revisit the design of existing keypoint detectors by deconstructing their methodologies and identifying the key components. We re-design each component from first principles and propose Simple Learned Keypoints (SiLK), which is fully differentiable, lightweight, and flexible. Despite its simplicity, SiLK advances the state of the art on the Detection Repeatability and Homography Estimation tasks on HPatches and on the 3D Point-Cloud Registration task on ScanNet, and achieves performance competitive with the state of the art on camera pose estimation in the 2022 Image Matching Challenge and on ScanNet.
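The abstract above reports results on Detection Repeatability. As a rough illustration of what such a metric measures, the following is a minimal sketch, not SiLK's evaluation code or the HPatches protocol. It assumes a known homography between two images, a hypothetical pixel tolerance `tol`, and toy keypoint lists; the function names are illustrative only.

```python
from math import hypot


def warp_point(H, pt):
    """Apply a 3x3 homography (nested lists) to a 2D point."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def repeatability(kps_a, kps_b, H_ab, tol=3.0):
    """Fraction of keypoints from image A that, once warped into image B,
    lie within `tol` pixels of some keypoint detected in image B.

    Simplified and one-directional (assumption): it ignores image bounds
    and does not enforce one-to-one matches.
    """
    if not kps_a or not kps_b:
        return 0.0
    hits = 0
    for pt in kps_a:
        u, v = warp_point(H_ab, pt)
        if any(hypot(u - x, v - y) <= tol for x, y in kps_b):
            hits += 1
    return hits / len(kps_a)


# Toy data (made up): image B is image A shifted by (5, -2) pixels.
H_shift = [[1.0, 0.0, 5.0], [0.0, 1.0, -2.0], [0.0, 0.0, 1.0]]
kps_a = [(10.0, 10.0), (20.0, 30.0), (40.0, 5.0), (7.0, 7.0)]
kps_b = [(15.0, 8.0), (25.5, 28.0), (90.0, 90.0)]
score = repeatability(kps_a, kps_b, H_shift)
print(score)
```

In this toy case two of the four warped keypoints land within the tolerance. Published protocols usually also restrict the count to the region visible in both images.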
International Journal of Applied Earth Observation and Geoinformation,
Journal Year:
2021,
Volume and Issue:
102, P. 102456 - 102456
Published: July 27, 2021
Deep Neural Networks (DNNs) learn representations from data with impressive capability, and have brought important breakthroughs for processing images, time-series, natural language, audio, video, and many others. In the remote sensing field, surveys and literature reviews specifically involving applications of DNN algorithms have been conducted in an attempt to summarize the amount of information produced in its subfields. Recently, Unmanned Aerial Vehicle (UAV)-based applications have dominated aerial sensing research. However, a literature review that combines both the "deep learning" and "UAV remote sensing" themes has not yet been conducted. The motivation for our work was to present a comprehensive review of the fundamentals of Deep Learning (DL) applied to UAV-based imagery. We focused mainly on describing classification and regression techniques used in recent applications with UAV-acquired data. For that, a total of 232 papers published in international scientific journal databases were examined. We gathered the published materials and evaluated their characteristics regarding the application, sensor, and technique used. We discuss how DL presents promising results and has potential for processing tasks associated with UAV-based image data. Lastly, we project future perspectives, commenting on prominent DL paths to be explored in the UAV remote sensing field. This review introduces, comments on, and summarizes the state of the art in UAV-based image applications with DNN algorithms across diverse subfields of remote sensing, grouping it into environmental, urban, and agricultural contexts.
IEEE Transactions on Geoscience and Remote Sensing,
Journal Year:
2022,
Volume and Issue:
60, P. 1 - 15
Published: Jan. 1, 2022
Registration of multisensor or multimodal image pairs with a large degree of distortion is a fundamental task for many remote sensing applications. To achieve accurate and low-cost remote sensing image registration, we propose a multiscale framework with unsupervised learning, named MU-Net. Without costly ground truth labels, MU-Net directly learns the end-to-end mapping from the image pairs to their transformation parameters. MU-Net stacks several deep neural network (DNN) models on multiple scales to generate a coarse-to-fine registration pipeline, which prevents the backpropagation from falling into a local extremum and resists significant image distortions. We design a novel loss function paradigm based on structural similarity, which makes MU-Net suitable for various types of multimodal images. MU-Net is compared with traditional feature-based and area-based methods, as well as supervised and other unsupervised learning methods, on optical-optical, optical-infrared, optical-synthetic aperture radar (SAR), and optical-map datasets. Experimental results show that MU-Net achieves more comprehensive and accurate registration performance between these image pairs with geometric and radiometric distortions. We share the code, implemented in PyTorch, at https://github.com/yeyuanxin110/MU-Net.
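The abstract above mentions a loss based on structural similarity but does not give its form. As a hedged illustration only, the sketch below computes the standard SSIM index over a single global window and turns it into a loss as `1 - SSIM`. It is not MU-Net's loss. The flat-list image representation, the constants `0.01` and `0.03`, and the toy data are assumptions.

```python
def ssim_global(x, y, data_range=1.0):
    """Single-window SSIM between two equally sized grayscale images,
    each given as a flat list of pixel intensities."""
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Small stabilising constants, conventional choices (assumption).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


def ssim_loss(fixed, warped, data_range=1.0):
    """Dissimilarity to minimise: 0 when the two images are identical."""
    return 1.0 - ssim_global(fixed, warped, data_range)


# Toy data (made up): a 1D "image" and a circularly shifted copy that
# stands in for a residual registration error.
fixed = [0.1, 0.4, 0.8, 0.3, 0.9, 0.2]
aligned = list(fixed)
misaligned = fixed[2:] + fixed[:2]
loss_aligned = ssim_loss(fixed, aligned)
loss_misaligned = ssim_loss(fixed, misaligned)
print(loss_aligned, loss_misaligned)
```

Because SSIM compares local statistics instead of raw intensities, a loss of this kind can be evaluated without ground-truth transformation labels. Practical implementations usually average SSIM over sliding windows, not one global window.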
IEEE Transactions on Geoscience and Remote Sensing,
Journal Year:
2023,
Volume and Issue:
61, P. 1 - 15
Published: Jan. 1, 2023
Identifying feature correspondences between multimodal images faces enormous challenges because of the significant differences in both radiation and geometry. To address these problems, we propose a novel feature matching method (named R2FD2) that is robust to radiation and rotation differences, which consists of a repeatable feature detector and a rotation-invariant feature descriptor. In the first stage, a repeatable feature detector called the Multi-channel Auto-correlation of the Log-Gabor (MALG) is presented for feature detection, which combines the multi-channel auto-correlation strategy with the Log-Gabor wavelets to detect interest points (IPs) with high repeatability and uniform distribution. In the second stage, a rotation-invariant feature descriptor is constructed, named the Rotation-invariant Maximum index map of the Log-Gabor (RMLG), which includes fast assignment of the dominant orientation and construction of the feature representation. In the process of fast assignment of the dominant orientation, a Rotation-invariant Maximum Index Map (RMIM) is built to address rotation deformations. Then, the proposed RMLG incorporates the rotation-invariant RMIM with the spatial configuration of DAISY to improve RMLG's resistance to radiation and rotation variances. Finally, we conduct experiments to validate the matching performance of our R2FD2 utilizing different types of multimodal image datasets. Experimental results show that the proposed R2FD2 outperforms five state-of-the-art feature matching methods. Moreover, our R2FD2 achieves matching accuracy within two pixels and has a great advantage in matching efficiency over the compared methods.
IEEE Transactions on Pattern Analysis and Machine Intelligence,
Journal Year:
2023,
Volume and Issue:
45(10), P. 12148 - 12166
Published: June 7, 2023
Existing image fusion methods are typically limited to aligned source images and have to "tolerate" parallaxes when images are unaligned. Simultaneously, the large variances between different modalities pose a significant challenge for multi-modal image registration. This study proposes a novel method called MURF, where, for the first time, image registration and fusion are mutually reinforced rather than being treated as separate issues. MURF leverages three modules: a shared information extraction module (SIEM), a multi-scale coarse registration module (MCRM), and a fine registration and fusion module (F2M). The registration is carried out in a coarse-to-fine manner. During coarse registration, SIEM first transforms multi-modal images into mono-modal shared information to eliminate the modal variances. Then, MCRM progressively corrects the global rigid parallaxes. Subsequently, fine registration to repair local non-rigid offsets and image fusion are uniformly implemented in F2M. The fused image provides feedback to improve registration accuracy, and the improved registration result further improves the fusion result. For image fusion, rather than solely preserving the original source information as in existing methods, we attempt to incorporate texture enhancement into image fusion. We test on four types of multi-modal data (RGB-IR, RGB-NIR, PET-MRI, and CT-MRI). Extensive registration and fusion results validate the superiority and universality of MURF.
IEEE Transactions on Circuits and Systems for Video Technology,
Journal Year:
2023,
Volume and Issue:
33(8), P. 3585 - 3595
Published: Jan. 16, 2023
Rigid registration is a transformation estimation problem between two point clouds. The two point clouds captured may only partially overlap owing to different viewpoints and acquisition times. Some previous correspondence-matching-based methods utilize an encoder-decoder network to carry out the partial-to-partial registration task and adopt a skip-connection structure to convey information between the encoder and the decoder. However, equally revisiting them with skip-connections may introduce information redundancy and limit the feature learning ability of the entire network. To address these problems, we propose a skip-attention-based correspondence filtering network (SACF-Net) for point cloud registration. A novel feature interaction mechanism is designed to utilize both low-level geometric information and high-level context-aware information to enhance the original pointwise matching map. Additionally, a skip-attention-based correspondence filtering method is proposed that selectively revisits features in the encoder at different resolutions, allowing the decoder to extract high-quality correspondences within overlapping regions. We conduct comprehensive experiments on indoor and outdoor scene datasets, and the results show that the proposed SACF-Net yields unprecedented performance improvements.
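The abstract above frames rigid registration as estimating a transformation between two point sets. To make that concrete, here is a minimal sketch of the closed-form least-squares solution in the simplified 2D case with known one-to-one correspondences. It is not SACF-Net and involves no learned correspondence filtering. The 3D analogue is commonly solved with an SVD-based (Kabsch) procedure. The function name and toy data are assumptions.

```python
from math import atan2, cos, sin, radians


def estimate_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst,
    assuming src[i] corresponds to dst[i] (no outliers, full overlap)."""
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two corresponding point pairs")
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(q[0] for q in dst) / n
    cy_d = sum(q[1] for q in dst) / n
    s_dot = 0.0    # accumulates |a|^2 * cos(theta) for centred pairs
    s_cross = 0.0  # accumulates |a|^2 * sin(theta) for centred pairs
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - cx_s, py - cy_s
        bx, by = qx - cx_d, qy - cy_d
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = atan2(s_cross, s_dot)
    c, s = cos(theta), sin(theta)
    # Translation aligns the rotated source centroid with the target centroid.
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, (tx, ty)


# Toy data (made up): rotate by 30 degrees, then translate by (2, -1).
true_theta = radians(30.0)
true_t = (2.0, -1.0)
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
dst = [
    (cos(true_theta) * x - sin(true_theta) * y + true_t[0],
     sin(true_theta) * x + cos(true_theta) * y + true_t[1])
    for x, y in src
]
theta, t = estimate_rigid_2d(src, dst)
print(theta, t)
```

With partial overlap, as the abstract describes, many candidate correspondences are wrong, which is why the quality of the correspondences passed to a solver like this matters.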
IEEE Transactions on Multimedia,
Journal Year:
2023,
Volume and Issue:
26, P. 313 - 325
Published: April 5, 2023
Humans tend to mine objects by learning from a group of images or several frames of video, since we live in a dynamic world. In the computer vision area, many researchers focus on co-segmentation (CoS), co-saliency detection (CoSD) and video salient object detection (VSOD) to discover the co-occurrent objects. However, previous approaches design different networks for these similar tasks separately, and they are difficult to apply to each other. Besides, they fail to take full advantage of the cues among inter- and intra-features within a group of images. In this paper, we introduce a unified framework to tackle these issues from a unified view, termed UFGS (Unified Framework for Group-based Segmentation). Specifically, we first introduce a transformer block, which views the image feature as a patch token and then captures their long-range dependencies through the self-attention mechanism. This can help the network excavate the patch-structured similarities among the relevant objects. Furthermore, we propose an intra-MLP learning module to produce a self-mask that helps the network avoid partial activation. Extensive experiments on four CoS benchmarks (PASCAL, iCoseg, Internet and MSRC), three CoSD benchmarks (Cosal2015, CoSOD3k, and CocA) and five VSOD benchmarks (DAVIS16, FBMS, ViSal, SegV2, and DAVSOD) show that our method outperforms other state-of-the-art methods on three different tasks in both accuracy and speed using the same network architecture, and can reach 140 FPS in real time.
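The abstract above describes treating image features as patch tokens and relating them through self-attention. The following is a minimal sketch of single-head scaled dot-product self-attention over a handful of tokens. It is not the UFGS transformer block. As a labelled simplification, it uses identity query/key/value projections where a real block would use learned ones, and the toy token values are made up.

```python
from math import exp, sqrt


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """tokens: list of N patch tokens, each a list of d floats.

    Returns (outputs, weights); weights[i][j] is how strongly token i
    attends to token j. Identity Q/K/V projections (assumption).
    """
    d = len(tokens[0])
    scale = 1.0 / sqrt(d)
    outputs, weights = [], []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Each output is the attention-weighted average of all tokens.
        outputs.append(
            [sum(w[j] * tokens[j][c] for j in range(len(tokens))) for c in range(d)]
        )
    return outputs, weights


# Toy data (made up): tokens 0 and 1 are similar, token 2 is different.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(tokens)
print(weights[0])
```

Token 0 places more weight on the similar token 1 than on token 2. This is the sense in which attention can surface similarities between patches, including patches drawn from different images in a group.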