2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Volume and Issue: 25, P. 16610-16620. Published: June 1, 2023.
Volumetric scene representations enable photorealistic view synthesis for static scenes and form the basis of several existing 6-DoF video techniques. However, the volume rendering procedures that drive these representations necessitate careful trade-offs in terms of quality, rendering speed, and memory efficiency. In particular, existing methods fail to simultaneously achieve real-time performance, a small memory footprint, and high-quality rendering for challenging real-world scenes. To address these issues, we present HyperReel, a novel 6-DoF video representation. The two core components of HyperReel are: (1) a ray-conditioned sample prediction network that enables high-fidelity, high frame rate rendering at high resolutions and (2) a compact and memory-efficient dynamic volume representation. Our 6-DoF video pipeline achieves the best performance compared to prior and contemporary approaches in terms of visual quality with small memory requirements, while also rendering at up to 18 frames per second at megapixel resolution without any custom CUDA code.
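The ray-conditioned sample prediction network is the piece that replaces dense ray marching. Below is a minimal PyTorch sketch of the idea; the module name, layer widths, and the origin-plus-direction ray encoding are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RaySamplePredictor(nn.Module):
    """Hypothetical sketch: map a ray to K sample depths along it, so the
    volume model is queried at only K points per ray instead of densely."""
    def __init__(self, num_samples: int = 32, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_samples),
        )

    def forward(self, origins: torch.Tensor, dirs: torch.Tensor):
        # Encode each ray by its origin and unit direction.
        rays = torch.cat([origins, dirs], dim=-1)      # (N, 6)
        # Predict K depths per ray; sigmoid keeps them in a bounded range.
        t = torch.sigmoid(self.mlp(rays))              # (N, K) in (0, 1)
        t, _ = torch.sort(t, dim=-1)                   # order along the ray
        # Sample points: x = o + t * d for each predicted depth.
        return origins[:, None, :] + t[..., None] * dirs[:, None, :]

predictor = RaySamplePredictor()
o = torch.zeros(4, 3)
d = torch.nn.functional.normalize(torch.randn(4, 3), dim=-1)
points = predictor(o, d)   # (4, 32, 3) sample locations for the volume model
```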
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Volume and Issue: unknown. Published: June 1, 2022.
We present a super-fast convergence approach to reconstructing the per-scene radiance field from a set of images that capture the scene with known poses. This task, which is often applied to novel view synthesis, was recently revolutionized by Neural Radiance Field (NeRF) for its state-of-the-art quality and flexibility. However, NeRF and its variants require a lengthy training time ranging from hours to days for a single scene. In contrast, our approach achieves NeRF-comparable quality and converges rapidly from scratch in less than 15 minutes with a single GPU. We adopt a representation consisting of a density voxel grid for scene geometry and a feature voxel grid with a shallow network for complex view-dependent appearance. Modeling with explicit and discretized volume representations is not new, but we propose two simple yet non-trivial techniques that contribute to fast convergence speed and high-quality output. First, we introduce post-activation interpolation on voxel density, which is capable of producing sharp surfaces at lower grid resolution. Second, direct voxel density optimization is prone to suboptimal geometry solutions, so we robustify the optimization process by imposing several priors. Finally, evaluation on five inward-facing benchmarks shows that our method matches, if not surpasses, NeRF's quality, yet it only takes about 15 minutes to train from scratch for a new scene. Code: https://github.com/sunset1995/DirectVoxGO.
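The post-activation trick is easy to see in code. A minimal PyTorch sketch follows (toy grid size; DVGO additionally uses a shifted softplus and converts densities to alphas, which is omitted here): interpolating raw grid values and applying the nonlinearity afterwards lets a single trilinear cell express a sharp density transition, whereas activating before interpolation can only produce a linear ramp.

```python
import torch
import torch.nn.functional as F

def interp(grid: torch.Tensor, pts: torch.Tensor) -> torch.Tensor:
    """Trilinearly interpolate a density grid (1, 1, D, H, W) at points
    given in normalized [-1, 1] coordinates, shape (N, 3)."""
    pts = pts.view(1, 1, 1, -1, 3)
    return F.grid_sample(grid, pts, align_corners=True).view(-1)

# A toy 1-channel grid holding *raw* (pre-activation) density values.
grid = torch.randn(1, 1, 16, 16, 16)
pts = torch.rand(1024, 3) * 2 - 1

# Pre-activation: interpolating already-activated densities blurs surfaces.
pre_act = interp(F.softplus(grid), pts)

# Post-activation (DVGO's choice): interpolate raw values first, activate
# after. The nonlinearity applied *after* interpolation lets one voxel cell
# produce a sharp occupancy transition instead of a linear ramp.
post_act = F.softplus(interp(grid, pts))
```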
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Volume and Issue: unknown, P. 12479-12488. Published: June 1, 2023.
We introduce k-planes, a white-box model for radiance fields in arbitrary dimensions. Our model uses d choose 2 planes to represent a d-dimensional scene, providing a seamless way to go from static (d = 3) to dynamic (d = 4) scenes. This planar factorization makes adding dimension-specific priors easy, e.g. temporal smoothness and multi-resolution spatial structure, and induces a natural decomposition of static and dynamic components of a scene. We use a linear feature decoder with a learned color basis that yields similar performance as a nonlinear black-box MLP decoder. Across a range of synthetic and real, static and dynamic, fixed and varying appearance scenes, k-planes yields competitive and often state-of-the-art reconstruction fidelity with low memory usage, achieving 1000x compression over a full 4D grid, and fast optimization with a pure PyTorch implementation. For video results and code, please see sarafridov.github.io/K-Planes.
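For d = 4, the d choose 2 factorization gives six feature planes, one per pair of axes, and a point's feature is the elementwise (Hadamard) product of its six plane lookups. A minimal single-resolution PyTorch sketch of that lookup follows; the paper uses multi-resolution grids, and the names and sizes here are illustrative.

```python
import itertools
import torch
import torch.nn.functional as F

d, feat_dim, res = 4, 32, 64
pairs = list(itertools.combinations(range(d), 2))   # d choose 2 = 6 planes
planes = [torch.randn(1, feat_dim, res, res) for _ in pairs]

def kplanes_features(q: torch.Tensor) -> torch.Tensor:
    """q: (N, 4) points in [-1, 1]^4 as (x, y, z, t). Returns (N, feat_dim)
    by bilinearly sampling each plane and multiplying elementwise."""
    feats = torch.ones(q.shape[0], feat_dim)
    for plane, (i, j) in zip(planes, pairs):
        coords = q[:, (i, j)].view(1, 1, -1, 2)     # project onto plane (i, j)
        sampled = F.grid_sample(plane, coords, align_corners=True)
        feats = feats * sampled.view(feat_dim, -1).t()   # Hadamard fusion
    return feats

q = torch.rand(8, 4) * 2 - 1
f = kplanes_features(q)  # feed to a linear decoder with a learned color basis
```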
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Volume and Issue: unknown. Published: June 1, 2023.
Modeling and re-rendering dynamic 3D scenes is a challenging task in 3D vision. Prior approaches build on NeRF and rely on implicit representations. This is slow since it requires many MLP evaluations, constraining real-world applications. We show that dynamic 3D scenes can be explicitly represented by six planes of learned features, leading to an elegant solution we call HexPlane. A HexPlane computes features for points in spacetime by fusing vectors extracted from each plane, which is highly efficient. Pairing a HexPlane with a tiny MLP to regress output colors and training via volume rendering gives impressive results for novel view synthesis on dynamic scenes, matching the image quality of prior work but reducing training time by more than 100×. Extensive ablations confirm our HexPlane design and show that it is robust to different feature fusion mechanisms, coordinate systems, and decoding mechanisms. HexPlanes are a simple and effective representation for 4D volumes, and we hope they can broadly contribute to modeling spacetime for dynamic scenes. Project page: https://caoang327.github.io/HexPlane.
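A minimal PyTorch sketch of the fusion step follows; channel counts, resolutions, and the exact pairing/fusion variant are illustrative assumptions (the paper ablates several fusion mechanisms). Each spatial plane is sampled at the point's projection, multiplied elementwise with its complementary spatio-temporal plane, and the three products are concatenated for the tiny MLP.

```python
import torch
import torch.nn.functional as F

res, C = 64, 16
# Six feature planes over the spacetime axes: three spatial-spatial and
# three spatial-temporal (illustrative resolution and channel count).
names = ["xy", "xz", "yz", "xt", "yt", "zt"]
planes = {n: torch.randn(1, C, res, res) for n in names}

def sample(name: str, uv: torch.Tensor) -> torch.Tensor:
    # Bilinear lookup of plane `name` at (N, 2) coords in [-1, 1].
    g = uv.view(1, 1, -1, 2)
    return F.grid_sample(planes[name], g, align_corners=True).view(C, -1).t()

def hexplane_features(p: torch.Tensor) -> torch.Tensor:
    """p: (N, 4) spacetime points (x, y, z, t) in [-1, 1]. Fuses each spatial
    plane with its complementary plane, then concatenates the products."""
    x, y, z, t = p.unbind(-1)
    def uv(a, b): return torch.stack([a, b], dim=-1)
    f1 = sample("xy", uv(x, y)) * sample("zt", uv(z, t))
    f2 = sample("xz", uv(x, z)) * sample("yt", uv(y, t))
    f3 = sample("yz", uv(y, z)) * sample("xt", uv(x, t))
    return torch.cat([f1, f2, f3], dim=-1)       # (N, 3C), input to tiny MLP

feats = hexplane_features(torch.rand(8, 4) * 2 - 1)
```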
2024 International Conference on 3D Vision (3DV), Volume and Issue: 35, P. 800-809. Published: March 18, 2024.
We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six degree-of-freedom (6-DOF) tracking of all dense scene elements. We follow an analysis-by-synthesis framework, inspired by recent work that models scenes as a collection of 3D Gaussians which are optimized to reconstruct input images via differentiable rendering. To model dynamic scenes, we allow the Gaussians to move and rotate over time while enforcing that they have persistent color, opacity, and size. By regularizing the Gaussians' motion and rotation with local-rigidity constraints, we show that our Dynamic 3D Gaussians correctly model the same area of physical space over time, including the rotation of that space. Dense 6-DOF tracking and dynamic reconstruction emerge naturally from persistent dynamic view synthesis, without requiring any correspondence or flow as input. We demonstrate a large number of downstream applications enabled by our representation, including first-person view synthesis and compositional dynamic 4D video editing. Project Website: dynamic3dgaussians.github.io
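A hedged sketch of what such a local-rigidity regularizer can look like is given below; the tensor shapes, neighbor count, and rotation-matrix parameterization are assumptions, and the paper's rigidity losses differ in detail.

```python
import torch

def local_rigidity_loss(pos0, pos_t, rot_t, neighbors):
    """Sketch of a local-rigidity prior: each Gaussian's neighborhood should
    move as an (approximately) rigid body between time 0 and time t.

    pos0:      (N, 3) Gaussian centers at time 0
    pos_t:     (N, 3) centers at time t
    rot_t:     (N, 3, 3) per-Gaussian rotation from time 0 to time t
    neighbors: (N, K) indices of each Gaussian's K nearest neighbors at time 0
    """
    # Offsets to neighbors, at both times.
    off0 = pos0[neighbors] - pos0[:, None, :]              # (N, K, 3)
    offt = pos_t[neighbors] - pos_t[:, None, :]            # (N, K, 3)
    # If the neighborhood is rigid, rotating the time-0 offset by the
    # Gaussian's own rotation should reproduce the time-t offset.
    predicted = torch.einsum("nij,nkj->nki", rot_t, off0)  # (N, K, 3)
    return (predicted - offt).norm(dim=-1).mean()

N, K = 100, 8
pos0, pos_t = torch.randn(N, 3), torch.randn(N, 3)
rot_t = torch.eye(3).expand(N, 3, 3)          # identity rotations for the demo
nbrs = torch.randint(0, N, (N, K))
loss = local_rigidity_loss(pos0, pos_t, rot_t, nbrs)
```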
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Volume and Issue: unknown, P. 18332-18343. Published: June 1, 2022.
Implicit neural rendering, especially Neural Radiance Field (NeRF), has shown great potential in novel view synthesis of a scene. However, current NeRF-based methods cannot enable users to perform user-controlled shape deformation in the scene. While existing works have proposed some approaches to modify the radiance field according to the user's constraints, the modification is limited to color editing or object translation and rotation. In this paper, we propose a method that allows users to perform controllable shape deformation on the implicit representation of the scene, and synthesizes novel view images of the edited scene without re-training the network. Specifically, we establish a correspondence between the extracted explicit mesh representation and the implicit neural representation of the target scene. Users can first utilize well-developed mesh-based deformation methods to deform the mesh representation of the scene. Our method then utilizes the user edits from the mesh representation to bend the camera rays, by introducing a tetrahedra mesh as a proxy, obtaining the rendering results of the edited scene. Extensive experiments demonstrate that our framework can achieve ideal editing results not only on synthetic data, but also on real scenes captured by users.
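The tetrahedral proxy makes the ray bending concrete: a point sampled on a straight ray through the deformed scene is expressed in barycentric coordinates of the deformed tetrahedron containing it, and the same weights applied to the rest-pose vertices give the point at which to query the unedited radiance field. A minimal PyTorch sketch of that mapping follows (single tetrahedron only; locating the containing tetrahedron is omitted).

```python
import torch

def barycentric(p, tet):
    """Barycentric coordinates of point p (3,) w.r.t. tetrahedron tet (4, 3)."""
    # Solve [v1-v0 | v2-v0 | v3-v0] @ (b1, b2, b3) = p - v0 for the last
    # three weights; the first weight is whatever remains.
    T = (tet[1:] - tet[0]).t()                  # (3, 3), columns are edges
    b123 = torch.linalg.solve(T, p - tet[0])    # (3,)
    return torch.cat([1 - b123.sum(dim=0, keepdim=True), b123])

def bend_point(p, tet_deformed, tet_rest):
    """Map a sample point on a straight ray through the *deformed* scene back
    to the rest-pose radiance field via shared barycentric weights."""
    w = barycentric(p, tet_deformed)            # (4,)
    return w @ tet_rest                         # (3,) rest-pose query point

tet_rest = torch.tensor([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
tet_deformed = tet_rest + 0.1 * torch.randn(4, 3)
p = tet_deformed.mean(dim=0)               # a point inside the deformed tet
q = bend_point(p, tet_deformed, tet_rest)  # where to sample the original NeRF
```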
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Volume and Issue: unknown, P. 20342-20352. Published: June 1, 2022.
In this paper, we propose HeadNeRF, a novel NeRF-based parametric head model that integrates the neural radiance field into the parametric representation of the human head. It can render high fidelity head images in real-time on modern GPUs, and supports directly controlling the generated images' rendering pose and various semantic attributes. Different from existing related parametric models, we use the neural radiance fields as a novel 3D proxy instead of a traditional 3D textured mesh, which makes HeadNeRF able to generate high fidelity images. However, the computationally expensive rendering process of the original NeRF hinders the construction of the proposed parametric NeRF model. To address this issue, we adopt the strategy of integrating 2D neural rendering into the rendering process of NeRF and design novel loss terms. As a result, the rendering speed of HeadNeRF can be significantly accelerated, and the rendering time of one frame is reduced from 5s to 25ms. The well-designed loss terms also improve the rendering accuracy, and the fine-level details of the human head, such as the gaps between teeth, wrinkles, and beards, can be represented and synthesized by HeadNeRF. Extensive experimental results and several applications demonstrate its effectiveness. The trained parametric model is available at https://github.com/CrisHY1995/headnerf.
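A minimal sketch of the 2D neural rendering strategy follows; the layer sizes and the 4x upsampling factor are illustrative assumptions. The idea is that volume rendering produces only a low-resolution feature map, and a small CNN upsamples it to the final RGB image, so the number of rays no longer scales with output pixels.

```python
import torch
import torch.nn as nn

class NeuralRenderer2D(nn.Module):
    """Hypothetical sketch of the 2D neural-rendering stage: volume rendering
    produces a low-resolution *feature* image, and a small upsampling CNN
    turns it into the full-resolution RGB frame."""
    def __init__(self, feat_ch: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(feat_ch, 64, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 32, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, feat_map: torch.Tensor) -> torch.Tensor:
        return self.net(feat_map)

# Volume rendering only a 64x64 feature map, then upsampling 4x to 256x256,
# replaces casting one ray per output pixel.
renderer = NeuralRenderer2D()
low_res_features = torch.randn(1, 64, 64, 64)   # from the NeRF branch
rgb = renderer(low_res_features)                # (1, 3, 256, 256)
```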
Neural radiance fields (NeRF) have shown great success in modeling 3D scenes and synthesizing novel-view images. However, most previous NeRF methods take much time to optimize one single scene. Explicit data structures, e.g. voxel features, show great potential to accelerate the training process. However, voxel features face two big challenges when applied to dynamic scenes, i.e. modeling temporal information and capturing different scales of point motions. We propose a radiance field framework that represents scenes with time-aware voxel features, named TiNeuVox. A tiny coordinate deformation network is introduced to model coarse motion trajectories, and temporal information is further enhanced in the radiance network. A multi-distance interpolation method is proposed and applied on voxel features to model both small and large motions. Our framework significantly accelerates the optimization of dynamic radiance fields while maintaining high rendering quality. Empirical evaluation is performed on both synthetic and real scenes. Our TiNeuVox completes training in only 8 minutes with an 8-MB storage cost, while showing similar or even better rendering performance than previous dynamic NeRF methods.
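A hedged sketch of the multi-distance interpolation idea is given below; it approximates the multiple effective voxel sizes by average-pooling the grid, which differs from the paper's exact sampling scheme, and all names and sizes are illustrative.

```python
import torch
import torch.nn.functional as F

def multi_distance_features(grid: torch.Tensor, pts: torch.Tensor,
                            scales=(1, 2, 4)) -> torch.Tensor:
    """Interpolate the same query point from the feature grid at several
    effective voxel sizes (here via average pooling), so small motions are
    captured by the fine grid and large motions by the coarser ones.

    grid: (1, C, D, H, W) voxel features; pts: (N, 3) in [-1, 1].
    Returns (N, C * len(scales)) concatenated features."""
    coords = pts.view(1, 1, 1, -1, 3)
    feats = []
    for s in scales:
        g = grid if s == 1 else F.avg_pool3d(grid, kernel_size=s, stride=s)
        sampled = F.grid_sample(g, coords, align_corners=True)  # (1,C,1,1,N)
        feats.append(sampled.view(grid.shape[1], -1).t())       # (N, C)
    return torch.cat(feats, dim=-1)

grid = torch.randn(1, 16, 32, 32, 32)
pts = torch.rand(256, 3) * 2 - 1
f = multi_distance_features(grid, pts)  # (256, 48), fed to the radiance net
```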