2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal year: 2023, Issue 25, pp. 16610-16620
Published: June 1, 2023
Volumetric scene representations enable photorealistic view synthesis for static scenes and form the basis of several existing 6-DoF video techniques. However, the volume rendering procedures that drive these representations necessitate careful trade-offs in terms of quality, rendering speed, and memory efficiency. In particular, existing methods fail to simultaneously achieve real-time performance, small memory footprint, and high-quality rendering for challenging real-world scenes. To address these issues, we present HyperReel, a novel 6-DoF video representation. The two core components of HyperReel are: (1) a ray-conditioned sample prediction network that enables high-fidelity, high-frame-rate rendering at high resolutions and (2) a compact and memory-efficient dynamic volume representation. Our 6-DoF video pipeline achieves the best performance compared to prior and contemporary approaches in terms of visual quality with small memory requirements, while also rendering at up to 18 frames per second at megapixel resolution without any custom CUDA code.
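The key efficiency idea in the abstract, replacing dense per-ray sampling with a small set of predicted sample locations per ray, can be sketched as below. This is a toy stand-in, not HyperReel's actual network: the `predict_samples` function and its ray-hash conditioning are illustrative assumptions, where the real method uses a learned MLP.

```python
import math

def uniform_samples(near, far, n):
    """Baseline: dense uniform sampling along a ray, which a
    sample prediction network is meant to replace."""
    step = (far - near) / (n - 1)
    return [near + i * step for i in range(n)]

def predict_samples(ray_o, ray_d, n=8, near=0.1, far=5.0):
    """Hypothetical stand-in for a ray-conditioned sample predictor:
    maps a ray (origin, direction) to a small, sorted set of sample
    distances. A toy hash of the ray biases the placement."""
    bias = (sum(ray_o) + sum(ray_d)) % 1.0  # toy ray conditioning
    ts = []
    for i in range(n):
        u = (i + 0.5) / n
        # squash samples toward a ray-dependent "surface" depth
        t = near + (far - near) * (u + 0.1 * math.sin(2 * math.pi * (u + bias)))
        ts.append(min(max(t, near), far))
    return sorted(ts)

dense = uniform_samples(0.1, 5.0, 192)
sparse = predict_samples((0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
print(len(dense), len(sparse))  # far fewer volume queries per ray
```

The payoff is that the expensive volume representation is queried only at the few predicted locations instead of at hundreds of uniform steps, which is what makes real-time rates plausible.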
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal year: 2023, Issue: unknown
Published: June 1, 2023
Neural Radiance Fields (NeRFs) have emerged as a popular approach for novel view synthesis. While NeRFs are quickly being adapted for a wider set of applications, intuitively editing NeRF scenes is still an open challenge. One important task is the removal of unwanted objects from a 3D scene, such that the replaced region is visually plausible and consistent with its context. We refer to this task as 3D inpainting. In 3D, solutions must be both consistent across multiple views and geometrically valid. In this paper, we propose a novel 3D inpainting method that addresses these challenges. Given a small set of posed images and sparse annotations in a single input image, our framework first rapidly obtains a 3D segmentation mask for a target object. Using the mask, a perceptual optimization-based approach is then introduced that leverages learned 2D image inpainters, distilling their information into 3D space, while ensuring view consistency. We also address the lack of a diverse benchmark for evaluating 3D scene inpainting methods by introducing a dataset comprised of challenging real-world scenes. In particular, our dataset contains views of the same scene with and without a target object, enabling more principled benchmarking of the 3D inpainting task. We demonstrate the superiority of our approach on multiview segmentation, comparing to NeRF-based approaches. We then evaluate on 3D inpainting, establishing state-of-the-art performance against other NeRF manipulation algorithms, as well as a strong 2D inpainter baseline.
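The distillation idea in the abstract, that independently run 2D inpainters produce mutually inconsistent per-view targets, while optimizing a single shared 3D quantity against all views at once forces a consistent result, can be reduced to a toy scalar example. All numbers and names here are illustrative, not the paper's actual pipeline or losses:

```python
# Per-view "inpainted colors" for the same 3D point, produced by an
# imagined 2D inpainter run independently per view (hence inconsistent).
targets = [0.62, 0.55, 0.70, 0.58]

shared = 0.0          # the single "3D" value rendered into every view
lr = 0.1
for _ in range(200):  # plain gradient descent on the summed L2 loss
    grad = sum(2 * (shared - t) for t in targets)
    shared -= lr * grad

print(round(shared, 4))  # → 0.6125, the per-view consensus (mean)
```

With an L2 loss the shared value converges to the mean of the per-view targets; the paper's perceptual objective is more forgiving than L2, but the mechanism, one 3D variable constrained by many 2D targets, is the same.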
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal year: 2023, Issue: unknown, pp. 17408-17419
Published: June 1, 2023
We present ESLAM, an efficient implicit neural representation method for Simultaneous Localization and Mapping (SLAM). ESLAM reads RGB-D frames with unknown camera poses in a sequential manner and incrementally reconstructs the scene representation while estimating the current camera position in the scene. We incorporate the latest advances in Neural Radiance Fields (NeRF) into a SLAM system, resulting in an efficient and accurate dense visual SLAM method. Our scene representation consists of multi-scale axis-aligned perpendicular feature planes and shallow decoders that, for each point in the continuous space, decode the interpolated features into Truncated Signed Distance Field (TSDF) and RGB values. Our extensive experiments on three standard datasets, Replica, ScanNet, and TUM RGB-D, show that ESLAM improves the accuracy of 3D reconstruction and camera localization of state-of-the-art dense visual SLAM methods by more than 50%, while it runs up to ×10 faster and does not require any pre-training. Project page: https://www.idiap.ch/paper/eslam.
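The representation described above, axis-aligned feature planes queried by bilinear interpolation and a shallow decoder mapping the interpolated features to TSDF and color, can be sketched as follows. The plane sizes, the summation of per-plane features, and the linear "decoder" weights are placeholder assumptions for illustration; ESLAM's actual decoders are small trained MLPs:

```python
import random

random.seed(0)
RES, F = 4, 3  # plane resolution and feature channels (toy sizes)

def make_plane():
    """A RES x RES grid of F-dimensional feature vectors."""
    return [[[random.uniform(-1, 1) for _ in range(F)]
             for _ in range(RES)] for _ in range(RES)]

# One feature plane per pair of axes (here single-scale for brevity;
# the paper uses multi-scale planes).
planes = {"xy": make_plane(), "xz": make_plane(), "yz": make_plane()}

def bilerp(plane, u, v):
    """Bilinearly interpolate a feature vector at (u, v) in [0, 1]^2."""
    x, y = u * (RES - 1), v * (RES - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, RES - 1), min(y0 + 1, RES - 1)
    fx, fy = x - x0, y - y0
    out = []
    for c in range(F):
        top = plane[y0][x0][c] * (1 - fx) + plane[y0][x1][c] * fx
        bot = plane[y1][x0][c] * (1 - fx) + plane[y1][x1][c] * fx
        out.append(top * (1 - fy) + bot * fy)
    return out

def decode(p):
    """Query the three planes at point p in [0,1]^3, fuse the features,
    and run a 'shallow decoder' (here an untrained linear map) to get a
    TSDF value and an RGB color."""
    x, y, z = p
    feat = [a + b + c for a, b, c in zip(
        bilerp(planes["xy"], x, y),
        bilerp(planes["xz"], x, z),
        bilerp(planes["yz"], y, z))]
    tsdf = sum(feat) / F
    rgb = [max(0.0, min(1.0, 0.5 + 0.5 * f)) for f in feat]
    return tsdf, rgb

print(decode((0.5, 0.5, 0.5)))
```

The memory advantage is the point of the design: three 2D planes grow quadratically with resolution, where a dense 3D feature grid grows cubically.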
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal year: 2023, Issue: unknown
Published: June 1, 2023
We propose Panoptic Lifting, a novel approach for learning panoptic 3D volumetric representations from images of in-the-wild scenes. Once trained, our model can render color images together with 3D-consistent panoptic segmentation from novel viewpoints. Unlike existing approaches, which use 3D input directly or indirectly, our method requires only machine-generated 2D panoptic segmentation masks inferred from a pre-trained network. Our core contribution is a panoptic lifting scheme based on a neural field representation that generates a unified and multi-view consistent panoptic representation of the scene. To account for inconsistencies of 2D instance identifiers across views, we solve a linear assignment with a cost based on the model's current predictions and the machine-generated segmentation masks, thus enabling us to lift 2D instances to 3D in a consistent way. We further ablate contributions that make our method more robust to noisy, machine-generated labels, including test-time augmentations for confidence estimates, a segment consistency loss, bounded segmentation fields, and gradient stopping. Experimental results validate our approach on the challenging Hypersim, Replica, and ScanNet datasets, improving by 8.4, 13.8, and 10.6% in scene-level PQ over the state of the art.
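The linear-assignment step described above, matching the model's 3D instances to the machine-generated 2D mask IDs in each view, can be sketched with a brute-force minimum-cost assignment. The cost values are illustrative; in practice the cost would come from the model's current predictions, and a real system would use the Hungarian algorithm rather than enumerating permutations:

```python
from itertools import permutations

def linear_assignment(cost):
    """Brute-force minimum-cost assignment: fine for the handful of
    instances per view, O(n!) in general."""
    n = len(cost)
    best, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best:
            best, best_perm = c, perm
    return best_perm

# Toy cost: cost[i][j] = 1 - overlap between the model's 3D instance i
# rendered into this view and machine-generated 2D mask j (illustrative).
cost = [
    [0.1, 0.9, 0.8],  # 3D instance 0 overlaps 2D mask 0
    [0.7, 0.8, 0.2],  # 3D instance 1 overlaps 2D mask 2
    [0.9, 0.3, 0.6],  # 3D instance 2 overlaps 2D mask 1
]
print(linear_assignment(cost))  # → (0, 2, 1)
```

Because 2D instance IDs are arbitrary per view, re-solving this matching each time lets the supervision stay consistent with one global set of 3D instance identities.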
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal year: 2023, Issue: unknown, pp. 803-814
Published: June 1, 2023
Recent advances in modeling 3D objects mostly rely on synthetic datasets due to the lack of large-scale real-scanned 3D databases. To facilitate the development of 3D perception, reconstruction, and generation in the real world, we propose OmniObject3D, a large-vocabulary 3D object dataset with massive high-quality real-scanned 3D objects. OmniObject3D has several appealing properties: 1) Large Vocabulary: it comprises 6,000 scanned objects in 190 daily categories, sharing common classes with popular 2D datasets (e.g., ImageNet and LVIS), benefiting the pursuit of generalizable 3D representations. 2) Rich Annotations: each 3D object is captured with both 2D and 3D sensors, providing textured meshes, point clouds, multi-view rendered images, and multiple real-captured videos. 3) Realistic Scans: the professional scanners support high-quality object scans with precise shapes and realistic appearances. With the vast exploration space offered by OmniObject3D, we carefully set up four evaluation tracks: a) robust 3D perception, b) novel-view synthesis, c) neural surface reconstruction, and d) 3D object generation. Extensive studies are performed on these four benchmarks, revealing new observations, challenges, and opportunities for future research in realistic 3D vision.