ACM Transactions on Graphics, 2021, 40(4), P. 1 - 13. Published: July 19, 2021
Neural representations have emerged as a new paradigm for applications in rendering, imaging, geometric modeling, and simulation. Compared to traditional representations such as meshes, point clouds, or volumes, they can be flexibly incorporated into differentiable learning-based pipelines. While recent improvements to neural representations now make it possible to represent signals with fine details at moderate resolutions (e.g., for images and 3D shapes), adequately representing large-scale or complex scenes has proven a challenge. Current neural representations fail to accurately represent images at resolutions greater than a megapixel, or 3D scenes with more than a few hundred thousand polygons. Here, we introduce a new hybrid implicit-explicit network architecture and training strategy that adaptively allocates resources during training and inference based on the local complexity of a signal of interest. Our approach uses a multiscale block-coordinate decomposition, similar to a quadtree or octree, that is optimized during training. The network architecture operates in two stages: using the bulk of the network parameters, a coordinate encoder generates a feature grid in a single forward pass. Then, hundreds or thousands of samples within each block can be efficiently evaluated using a lightweight decoder. With this hybrid implicit-explicit network architecture, we demonstrate the first experiments that fit gigapixel images at nearly 40 dB peak signal-to-noise ratio. Notably, this represents an increase in scale of over 1000X compared to the resolution of previously demonstrated image-fitting experiments. Moreover, our approach is able to represent 3D shapes significantly faster and better than previous techniques; it reduces training times from days to hours or minutes and memory requirements by over an order of magnitude.
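The two-stage design is the heart of the method: a large coordinate encoder runs once per block, and a small decoder is then queried thousands of times against the resulting feature grid. Below is a minimal sketch of that split under illustrative sizes; the module shapes, grid resolution, and `grid_sample` interpolation are assumptions for exposition, not the authors' released architecture.

```python
# Sketch of the two-stage encoder/decoder split described above (illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlockEncoder(nn.Module):
    """Maps a block's global (x, y, scale) coordinate to a feature grid."""
    def __init__(self, feat_dim=16, grid_res=32, hidden=512):
        super().__init__()
        self.feat_dim, self.grid_res = feat_dim, grid_res
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim * grid_res * grid_res))

    def forward(self, block_coord):                    # (B, 3)
        g = self.mlp(block_coord)
        return g.view(-1, self.feat_dim, self.grid_res, self.grid_res)

class LightweightDecoder(nn.Module):
    """Decodes interpolated grid features to the signal (e.g., RGB)."""
    def __init__(self, feat_dim=16, out_dim=3):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                 nn.Linear(64, out_dim))

    def forward(self, grid, local_xy):                 # local_xy (B, S, 2) in [-1, 1]
        feats = F.grid_sample(grid, local_xy.unsqueeze(1), align_corners=True)
        feats = feats.squeeze(2).permute(0, 2, 1)      # (B, S, C)
        return self.mlp(feats)

enc, dec = BlockEncoder(), LightweightDecoder()
grid = enc(torch.rand(4, 3))                           # one forward pass per block
rgb = dec(grid, torch.rand(4, 4096, 2) * 2 - 1)        # thousands of cheap samples
print(rgb.shape)                                       # torch.Size([4, 4096, 3])
```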
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), P. 7206 - 7215. Published: June 1, 2021
We present a learning-based method for synthesizing novel views of complex scenes using only unstructured collections of in-the-wild photographs. We build on Neural Radiance Fields (NeRF), which uses the weights of a multi-layer perceptron to model the density and color of a scene as a function of 3D coordinates. While NeRF works well on images of static subjects captured under controlled settings, it is incapable of modeling many ubiquitous, real-world phenomena in uncontrolled images, such as variable illumination or transient occluders. We introduce a series of extensions to NeRF to address these issues, thereby enabling accurate reconstructions from unstructured image collections taken from the internet. We apply our system, dubbed NeRF-W, to internet photo collections of famous landmarks, and demonstrate temporally consistent novel view renderings that are significantly closer to photorealism than the prior state of the art.
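As a rough illustration of the building blocks named here, the sketch below pairs a plain coordinate MLP over positionally encoded 3D points with a learned per-image appearance embedding that conditions only the color head, so per-photo illumination can vary while density stays shared. All layer sizes and the embedding dimension are illustrative assumptions, not the released NeRF-W model.

```python
# Minimal sketch: NeRF-style MLP with per-image appearance conditioning.
import torch
import torch.nn as nn

def posenc(x, n_freqs=6):
    """Standard NeRF positional encoding: sin/cos at octave frequencies."""
    out = [x]
    for i in range(n_freqs):
        out += [torch.sin((2.0 ** i) * x), torch.cos((2.0 ** i) * x)]
    return torch.cat(out, dim=-1)

class TinyNeRFW(nn.Module):
    def __init__(self, n_images, app_dim=16, n_freqs=6, hidden=128):
        super().__init__()
        in_dim = 3 * (1 + 2 * n_freqs)
        self.appearance = nn.Embedding(n_images, app_dim)  # one code per photo
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        self.sigma = nn.Linear(hidden, 1)                  # shared static density
        self.rgb = nn.Sequential(nn.Linear(hidden + app_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 3), nn.Sigmoid())

    def forward(self, xyz, image_ids):
        h = self.trunk(posenc(xyz))
        sigma = torch.relu(self.sigma(h))                  # geometry: no appearance input
        rgb = self.rgb(torch.cat([h, self.appearance(image_ids)], dim=-1))
        return sigma, rgb

model = TinyNeRFW(n_images=100)
sigma, rgb = model(torch.rand(1024, 3), torch.randint(0, 100, (1024,)))
```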
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Published: June 1, 2021
Deep generative models allow for photorealistic image synthesis at high resolutions. But for many applications, this is not enough: content creation also needs to be controllable. While several recent works investigate how to disentangle underlying factors of variation in the data, most of them operate in 2D and hence ignore that our world is three-dimensional. Further, only few works consider the compositional nature of scenes. Our key hypothesis is that incorporating a compositional 3D scene representation into the generative model leads to more controllable image synthesis. Representing scenes as compositional generative neural feature fields allows us to disentangle one or multiple objects from the background as well as individual objects' shapes and appearances, while learning from unstructured and unposed image collections without any additional supervision. Combining this scene representation with a neural rendering pipeline yields a fast and realistic image synthesis model. As evidenced by our experiments, our model is able to disentangle individual objects and allows for translating and rotating them in the scene as well as changing the camera pose.
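The compositional claim rests on a simple operator: per-object feature fields (plus a background field) are queried at the same 3D point and combined by density-weighted averaging, so editing one object leaves the rest untouched. The sketch below shows that composition step with stand-in tensors in place of the paper's generator networks.

```python
# Density-weighted composition of per-object feature fields (stand-in tensors).
import torch

def compose(sigmas, feats, eps=1e-8):
    """sigmas: list of (N, 1) densities; feats: list of (N, C) features."""
    sigma = torch.stack(sigmas).sum(dim=0)                    # total density
    weighted = torch.stack([s * f for s, f in zip(sigmas, feats)]).sum(dim=0)
    feat = weighted / (sigma + eps)                           # density-weighted mean
    return sigma, feat

N, C = 4096, 32
obj1, obj2, bg = (torch.rand(N, 1) for _ in range(3))         # densities per field
f1, f2, fbg = (torch.rand(N, C) for _ in range(3))            # feature vectors
sigma, feat = compose([obj1, obj2, bg], [f1, f2, fbg])
# Moving one object is just a rigid transform of its input coordinates before
# its field is queried, leaving the other objects and the background untouched.
```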
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), P. 5795 - 5805. Published: June 1, 2021
We have witnessed rapid progress on 3D-aware image synthesis, leveraging recent advances in generative visual models and neural rendering. Existing approaches, however, fall short in two ways: first, they may lack an underlying 3D representation or rely on view-inconsistent rendering, hence synthesizing images that are not multi-view consistent; second, they often depend upon network architectures that are not expressive enough, and their results thus lack in quality. We propose a novel generative model, named Periodic Implicit Generative Adversarial Networks (π-GAN or pi-GAN), for high-quality 3D-aware image synthesis. π-GAN leverages neural representations with periodic activation functions and volumetric rendering to represent scenes as view-consistent radiance fields. The proposed approach obtains state-of-the-art results for 3D-aware image synthesis with multiple real and synthetic datasets.
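The "periodic activation functions" refer to SIREN-style sine layers, which π-GAN modulates with frequencies and phases produced by a mapping network on the latent code. A minimal sketch of such a FiLM-modulated sine layer follows; the initialization constants follow the common SIREN scheme, and the placeholder FiLM parameters stand in for the mapping network, which is not shown.

```python
# Sketch of a FiLM-modulated SIREN layer: sin(freq * (W x + b) + phase).
import math
import torch
import torch.nn as nn

class FiLMSiren(nn.Module):
    def __init__(self, in_dim, out_dim, w0=30.0, first=False):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        # SIREN-style initialization keeps activations well distributed.
        bound = 1.0 / in_dim if first else math.sqrt(6.0 / in_dim) / w0
        nn.init.uniform_(self.linear.weight, -bound, bound)
        self.w0 = w0

    def forward(self, x, freq, phase):
        # freq/phase would come from a mapping network on the latent z (not shown).
        return torch.sin(freq * self.w0 * self.linear(x) + phase)

layer = FiLMSiren(3, 256, first=True)
x = torch.rand(1024, 3)
freq, phase = torch.ones(1, 256), torch.zeros(1, 256)   # placeholder FiLM params
h = layer(x, freq, phase)                               # (1024, 256)
```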
2021 IEEE/CVF International Conference on Computer Vision (ICCV), P. 14104 - 14113. Published: Oct. 1, 2021
We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct neural radiance fields for view synthesis. Unlike prior works on neural radiance fields that consider per-scene optimization on densely captured images, we propose a generic deep neural network that can reconstruct radiance fields from only three nearby input views via fast network inference. Our approach leverages plane-swept cost volumes (widely used in multi-view stereo) for geometry-aware scene reasoning, and combines this with physically based volume rendering for neural radiance field reconstruction. We train our network on real objects in the DTU dataset, and test it on different datasets to evaluate its effectiveness and generalizability. Our approach can generalize across scenes (even indoor scenes, completely different from our training scenes of objects) and generate realistic view synthesis results using only three input images, significantly outperforming concurrent works on generalizable radiance field reconstruction. Moreover, if dense images are captured, our estimated radiance field representation can be easily fine-tuned; this leads to fast per-scene reconstruction with higher rendering quality and substantially less optimization time than NeRF.
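The plane-swept cost volume mentioned above is typically built by warping 2D image features from each input view onto a sweep of depth planes and scoring photo-consistency as the per-voxel variance across views. The sketch below shows only that variance step, assuming the warping has already been done; the tensors are stand-ins, not MVSNeRF's pipeline.

```python
# Variance-based plane-sweep cost volume over pre-warped view features.
import torch

def variance_cost_volume(warped):
    """warped: (V, C, D, H, W) features from V views on D depth planes.
    Low variance across views suggests photo-consistent (occupied) geometry."""
    mean = warped.mean(dim=0, keepdim=True)
    return ((warped - mean) ** 2).mean(dim=0)        # (C, D, H, W)

V, C, D, H, W = 3, 8, 64, 32, 40                     # 3 nearby input views
warped = torch.rand(V, C, D, H, W)                   # stand-in warped features
cost = variance_cost_volume(warped)
# A 3D CNN would next turn `cost` into a neural encoding volume from which
# per-point features are interpolated for radiance-field prediction.
print(cost.shape)                                    # torch.Size([8, 64, 32, 40])
```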
2021 IEEE/CVF International Conference on Computer Vision (ICCV), P. 5721 - 5731. Published: Oct. 1, 2021
Neural Radiance Fields (NeRF) [31] have recently gained a surge of interest within the computer vision community for its power to synthesize photorealistic novel views of real-world scenes. One limitation of NeRF, however, is its requirement of accurate camera poses to learn the scene representations. In this paper, we propose Bundle-Adjusting Neural Radiance Fields (BARF) for training NeRF from imperfect (or even unknown) camera poses — the joint problem of learning neural 3D representations and registering camera frames. We establish a theoretical connection to classical image alignment and show that coarse-to-fine registration is also applicable to NeRF. Furthermore, we show that naïvely applying positional encoding in NeRF has a negative impact on registration with a synthesis-based objective. Experiments on synthetic and real-world data show that BARF can effectively optimize the neural scene representations and resolve large camera pose misalignment at the same time. This enables view synthesis and localization of video sequences from unknown camera poses, opening up new avenues for visual localization systems (e.g. SLAM) and potential applications for dense 3D mapping and reconstruction.
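The coarse-to-fine registration idea can be made concrete as a schedule on the positional encoding: each frequency band is faded in by a weight that ramps from 0 to 1 as optimization progresses, so the pose gradients first see smooth, low-frequency signals. The sketch below implements that published ramp; the frequency count and usage around it are illustrative.

```python
# Coarse-to-fine positional encoding: frequency bands faded in over training.
import math
import torch

def band_weights(alpha, n_freqs):
    """alpha in [0, n_freqs] grows with training progress."""
    k = torch.arange(n_freqs, dtype=torch.float32)
    t = (alpha - k).clamp(0.0, 1.0)
    return 0.5 * (1.0 - torch.cos(t * math.pi))      # 0 -> band off, 1 -> fully on

def coarse_to_fine_posenc(x, alpha, n_freqs=8):
    w = band_weights(alpha, n_freqs)
    enc = [x]
    for i in range(n_freqs):
        enc += [w[i] * torch.sin((2.0 ** i) * x),
                w[i] * torch.cos((2.0 ** i) * x)]
    return torch.cat(enc, dim=-1)

x = torch.rand(1024, 3)
early = coarse_to_fine_posenc(x, alpha=1.5)   # only low frequencies active
late = coarse_to_fine_posenc(x, alpha=8.0)    # full positional encoding
```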
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), P. 8645 - 8654. Published: June 1, 2021
We present dynamic neural radiance fields for modeling the appearance and dynamics of a human face. Digitally reconstructing a talking human is a key building-block for a variety of applications. Especially, for telepresence applications in AR or VR, a faithful reproduction of the appearance including novel viewpoints or head-poses is required. In contrast to state-of-the-art approaches that model the geometry and material properties explicitly, or are purely image-based, we introduce an implicit representation of the head based on scene representation networks. To handle the dynamics of the face, we combine our scene representation network with a low-dimensional morphable model which provides explicit control over pose and expressions. We use volumetric rendering to generate images from this hybrid representation and demonstrate that such a dynamic neural scene representation can be learned from monocular input data only, without the need for a specialized capture setup. In our experiments, we show that this learned volumetric representation allows for photorealistic image generation that surpasses the quality of state-of-the-art video-based reenactment methods.
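One way to picture the hybrid representation is a NeRF-style MLP whose input concatenates the encoded 3D point with the morphable model's low-dimensional expression coefficients, so the field is explicitly driven by pose and expression parameters. The sketch below is such a conditioning layout under assumed dimensions; it is not the paper's released network.

```python
# Sketch: radiance field conditioned on morphable-model expression coefficients.
import torch
import torch.nn as nn

class ExpressionConditionedField(nn.Module):
    def __init__(self, pos_dim=63, expr_dim=76, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + expr_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4))                    # density + RGB

    def forward(self, encoded_xyz, expr):
        out = self.mlp(torch.cat([encoded_xyz, expr], dim=-1))
        sigma, rgb = torch.relu(out[..., :1]), torch.sigmoid(out[..., 1:])
        return sigma, rgb

field = ExpressionConditionedField()
sigma, rgb = field(torch.rand(1024, 63), torch.rand(1024, 76))
# Driving the avatar means replaying (or editing) the per-frame expression
# codes from the morphable-model tracker while rendering from any viewpoint.
```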
ACM Transactions on Graphics, 2021, 40(6), P. 1 - 12. Published: Dec. 1, 2021
Neural Radiance Fields (NeRF) are able to reconstruct scenes with unprecedented fidelity, and various recent works have extended NeRF to handle dynamic scenes. A common approach to reconstruct such non-rigid scenes is through the use of a learned deformation field mapping from coordinates in each input image into a canonical template coordinate space. However, these deformation-based approaches struggle to model changes in topology, as topological changes require a discontinuity in the deformation field, but these deformation fields are necessarily continuous. We address this limitation by lifting NeRFs into a higher dimensional space, and by representing the 5D radiance field corresponding to each individual input image as a slice through this "hyper-space". Our method is inspired by level set methods, which model the evolution of surfaces as slices through a higher dimensional surface. We evaluate our method on two tasks: (i) interpolating smoothly between "moments", i.e., configurations of the scene, seen in the input images while maintaining visual plausibility, and (ii) novel-view synthesis at fixed moments. We show that our method, which we dub HyperNeRF, outperforms existing methods on both tasks. Compared to Nerfies, HyperNeRF reduces average error rates by 4.1% for interpolation and by 8.6% for novel-view synthesis, as measured by LPIPS. Additional videos, results, and visualizations are available at hypernerf.github.io.
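Concretely, the lifting can be realized with two networks: an ambient-slice network mapping a 3D point and a per-frame code to extra "ambient" coordinates w, and a template field queried at the lifted point (x, w). The sketch below shows that layout with illustrative dimensions, not the released HyperNeRF configuration.

```python
# Sketch: ambient-slice network plus a template field over the lifted point.
import torch
import torch.nn as nn

class HyperSliceField(nn.Module):
    def __init__(self, latent_dim=8, ambient_dim=2, hidden=128):
        super().__init__()
        self.slice = nn.Sequential(                 # (x, per-frame code) -> w
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, ambient_dim))
        self.template = nn.Sequential(              # field over (x, w)
            nn.Linear(3 + ambient_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4))                   # density + RGB

    def forward(self, x, frame_code):
        w = self.slice(torch.cat([x, frame_code], dim=-1))
        out = self.template(torch.cat([x, w], dim=-1))
        return torch.relu(out[..., :1]), torch.sigmoid(out[..., 1:])

field = HyperSliceField()
sigma, rgb = field(torch.rand(1024, 3), torch.rand(1024, 8))
# Topology changes that would need a discontinuous 3D deformation become
# smooth movements of the slice through the higher-dimensional template.
```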
2021 IEEE/CVF International Conference on Computer Vision (ICCV), P. 12939 - 12950. Published: Oct. 1, 2021
We present Non-Rigid Neural Radiance Fields (NR-NeRF), a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes. Our approach takes RGB images of a dynamic scene as input (e.g., from a monocular video recording), and creates a high-quality space-time geometry and appearance representation. We show that a single handheld consumer-grade camera is sufficient to synthesize sophisticated renderings of a dynamic scene from novel virtual camera views, e.g. a 'bullet-time' video effect. NR-NeRF disentangles the dynamic scene into a canonical volume and its deformation. Scene deformation is implemented as ray bending, where straight rays are deformed non-rigidly. We also propose a novel rigidity network to better constrain rigid regions of the scene, leading to more stable results. The ray bending and rigidity networks are trained without explicit supervision. Our formulation enables dense correspondence estimation across views and time, and compelling video editing applications such as motion exaggeration. Our code will be open sourced.
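Ray bending can be sketched as a small network that offsets each sample on a straight camera ray into the canonical volume, with the offset scaled by a per-point rigidity score so that rigid regions stay fixed. The architectures and latent size below are illustrative stand-ins for the paper's networks.

```python
# Sketch: ray bending with a learned per-point rigidity gate.
import torch
import torch.nn as nn

class RayBending(nn.Module):
    def __init__(self, latent_dim=32, hidden=128):
        super().__init__()
        self.bend = nn.Sequential(nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 3))       # per-point offset
        self.rigidity = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                      nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, x, frame_code):
        offset = self.bend(torch.cat([x, frame_code], dim=-1))
        r = self.rigidity(x)              # near 0 suppresses bending in rigid regions
        return x + r * offset             # bent sample position in the canonical volume

bender = RayBending()
x = torch.rand(1024, 3)                   # points sampled along straight camera rays
x_canonical = bender(x, torch.rand(1024, 32))
# The canonical NeRF is then queried at x_canonical; bending and rigidity are
# trained end-to-end from RGB supervision alone, without deformation labels.
```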
Computer Graphics Forum, 2022, 41(2), P. 641 - 676. Published: May 1, 2022
Recent advances in machine learning have led to increased interest in solving visual computing problems using methods that employ coordinate-based neural networks. These methods, which we call neural fields, parameterize physical properties of scenes or objects across space and time. They have seen widespread success in problems such as 3D shape and image synthesis, animation of human bodies, 3D reconstruction, and pose estimation. Rapid progress has led to numerous papers, but a consolidation of the discovered knowledge has not yet emerged. We provide context, mathematical grounding, and a review of over 250 papers in the literature on neural fields. In Part I, we focus on neural field techniques by identifying common components of neural field methods, including different conditioning, representation, forward map, architecture, and manipulation methods. In Part II, we focus on applications of neural fields to problems in visual computing, and beyond (e.g., robotics, audio). Our review shows the breadth of topics already covered in visual computing, both historically and in current incarnations, and highlights the improved quality, flexibility, and capability brought by neural field methods. Finally, we present a companion website that acts as a living database that can be continually updated by the community.
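For readers new to the topic, the object under review can be stated in a few lines of code: a coordinate-based network that parameterizes some quantity over space (and possibly time), fit by gradient descent to agree with observations. The toy example below fits such a field to a synthetic 2D signal; everything in it is illustrative.

```python
# Toy neural field: a coordinate MLP fit to a synthetic 2D signal.
import torch
import torch.nn as nn

field = nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 3))               # (x, y) -> RGB

coords = torch.rand(4096, 2)                          # query locations in [0, 1]^2
target = torch.stack([coords[:, 0], coords[:, 1],
                      coords.sum(1) / 2], dim=-1)     # stand-in "image" signal

opt = torch.optim.Adam(field.parameters(), lr=1e-3)
for step in range(200):                               # fit the field to the signal
    loss = ((field(coords) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
# Conditioning, forward maps (e.g. volume rendering), and hybrid data
# structures discussed in Part I all build on this basic pattern.
```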
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), P. 9416 - 9426. Published: June 1, 2021
We present a method that learns a spatiotemporal neural irradiance field for dynamic scenes from a single video. Our learned representation enables free-viewpoint rendering of the input video, and builds upon recent advances in neural implicit representations. Learning a spatiotemporal irradiance field from a single video poses significant challenges because the video contains only one observation of the scene at any point in time. The 3D geometry of a scene can be legitimately represented in numerous ways since varying geometry (motion) can be explained with varying appearance and vice versa. We address this ambiguity by constraining the time-varying geometry of our dynamic scene representation using the scene depth estimated from video depth estimation methods, aggregating contents of individual frames into a single global representation. We provide an extensive quantitative evaluation and demonstrate compelling free-viewpoint rendering results.
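The depth constraint can be made concrete with standard volume rendering: the weights along each ray give an expected termination depth, which is pushed toward the depth predicted by an off-the-shelf video depth estimator. The sketch below computes that expected depth and a simple penalty with stand-in tensors; it illustrates the idea rather than the paper's exact loss.

```python
# Depth supervision via expected ray termination under volume-rendering weights.
import torch

def render_weights(sigma, deltas):
    """sigma, deltas: (R, S) densities and segment lengths along each ray."""
    alpha = 1.0 - torch.exp(-sigma * deltas)
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1.0 - alpha + 1e-10], dim=1), dim=1)[:, :-1]
    return alpha * trans                              # contribution of each sample

R, S = 1024, 64
t = torch.linspace(2.0, 6.0, S).expand(R, S)          # sample depths along rays
sigma = torch.rand(R, S)                              # field densities (stand-in)
w = render_weights(sigma, torch.full((R, S), 4.0 / S))
expected_depth = (w * t).sum(dim=1)                   # where each ray terminates
est_depth = torch.full((R,), 4.0)                     # monocular depth estimate
depth_loss = ((expected_depth - est_depth) ** 2).mean()
# Adding depth_loss to the photometric loss disambiguates motion that could
# otherwise be explained away by time-varying appearance.
```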