In-Place Scene Labelling and Understanding with Implicit Scene Representation
Shuaifeng Zhi, Tristan Laidlow, Stefan Leutenegger

et al.

2021 IEEE/CVF International Conference on Computer Vision (ICCV), Journal Year: 2021, Volume and Issue: unknown, P. 15818 - 15827

Published: Oct. 1, 2021

Semantic labelling is highly correlated with geometry and radiance reconstruction, as scene entities with similar shape and appearance are more likely to come from similar classes. Recent implicit neural reconstruction techniques are appealing as they do not require prior training data, but the same fully self-supervised approach is not possible for semantics because labels are human-defined properties. We extend neural radiance fields (NeRF) to jointly encode semantics with appearance and geometry, so that complete and accurate 2D semantic labels can be achieved using a small amount of in-place annotations specific to the scene. The intrinsic multi-view consistency and smoothness of NeRF benefit semantics by enabling sparse labels to efficiently propagate. We show the benefit of this approach when labels are either sparse or very noisy in room-scale scenes. We demonstrate its advantageous properties in various interesting applications such as an efficient scene labelling tool, novel semantic view synthesis, label denoising, super-resolution, label interpolation and multi-view semantic label fusion in visual semantic mapping systems.
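The core change is an extra semantic head on the NeRF MLP. Below is a minimal PyTorch-style sketch of that idea; the layer sizes, class count, and names are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SemanticNeRF(nn.Module):
    """NeRF-style MLP with an additional view-independent semantic head."""
    def __init__(self, pos_dim=63, dir_dim=27, hidden=256, num_classes=28):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(pos_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma = nn.Linear(hidden, 1)                # volume density
        self.rgb = nn.Sequential(                        # view-dependent colour
            nn.Linear(hidden + dir_dim, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )
        self.semantic = nn.Linear(hidden, num_classes)   # class logits

    def forward(self, x, d):
        h = self.trunk(x)                      # positional features
        sigma = torch.relu(self.sigma(h))
        rgb = self.rgb(torch.cat([h, d], dim=-1))
        sem_logits = self.semantic(h)          # no view-direction input
        return sigma, rgb, sem_logits
```

The semantic logits are composited along each ray with the same density-derived weights as colour, so a cross-entropy loss on a handful of labelled pixels propagates labels to all consistent views.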

Language: English

Discovering Physical Concepts with Neural Networks
Raban Iten, Tony Metger, Henrik Wilming

et al.

Physical Review Letters, Journal Year: 2020, Volume and Issue: 124(1)

Published: Jan. 8, 2020

Despite the success of neural networks at solving concrete physics problems, their use as a general-purpose tool for scientific discovery is still in its infancy. Here, we approach this problem by modeling a neural network architecture after the human physical reasoning process, which has similarities to representation learning. This allows us to make progress towards the long-term goal of machine-assisted scientific discovery from experimental data without making prior assumptions about the system. We apply this method to toy examples and show that the network finds physically relevant parameters, exploits conservation laws to make predictions, and can help to gain conceptual insights, e.g., Copernicus' conclusion that the solar system is heliocentric.

DOI: https://doi.org/10.1103/PhysRevLett.124.010508
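As an illustration of "modeling the network architecture after the human physical reasoning process", the following PyTorch sketch shows the kind of encoder-decoder the paper describes: observations are compressed into a few latent neurons, and a decoder must answer a question about the system from that latent code alone. All sizes and names here are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ConceptNet(nn.Module):
    """Encoder compresses observations into a small latent representation;
    the decoder answers a question using only that latent code."""
    def __init__(self, obs_dim=50, question_dim=1, latent_dim=3, answer_dim=1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 100), nn.ELU(), nn.Linear(100, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + question_dim, 100), nn.ELU(),
            nn.Linear(100, answer_dim))

    def forward(self, observation, question):
        z = self.encoder(observation)   # ideally: physically relevant parameters
        return self.decoder(torch.cat([z, question], dim=-1))
```

Because the bottleneck is small, the latent neurons are pressured to align with the physically relevant parameters of the system, which is what enables the conceptual insights mentioned above.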

Language: English

Citations

423

BARF: Bundle-Adjusting Neural Radiance Fields
Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba

et al.

2021 IEEE/CVF International Conference on Computer Vision (ICCV), Journal Year: 2021, Volume and Issue: unknown, P. 5721 - 5731

Published: Oct. 1, 2021

Neural Radiance Fields (NeRF) [31] have recently gained a surge of interest within the computer vision community for their power to synthesize photorealistic novel views of real-world scenes. One limitation of NeRF, however, is its requirement of accurate camera poses to learn the scene representations. In this paper, we propose Bundle-Adjusting Neural Radiance Fields (BARF) for training NeRF from imperfect (or even unknown) camera poses — the joint problem of learning neural 3D representations and registering camera frames. We establish a theoretical connection to classical image alignment and show that coarse-to-fine registration is also applicable to NeRF. Furthermore, we show that naïvely applying positional encoding in NeRF has a negative impact on registration with a synthesis-based objective. Experiments on synthetic and real-world data show that BARF can effectively optimize the neural scene representations and resolve large camera pose misalignment at the same time. This enables view synthesis and localization of video sequences from unknown camera poses, opening up new avenues for visual localization systems (e.g. SLAM) and potential applications for dense 3D mapping and reconstruction.
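The coarse-to-fine registration amounts to smoothly fading in the frequency bands of the positional encoding over the course of training. A NumPy sketch of that schedule follows, assuming the cosine easing described in the paper and omitting details such as the raw-coordinate passthrough:

```python
import numpy as np

def band_weights(alpha, num_freqs):
    """Weight for each frequency band: 0 before it starts, cosine-eased
    up to 1 as the schedule parameter alpha sweeps past its index."""
    k = np.arange(num_freqs)
    t = np.clip(alpha - k, 0.0, 1.0)
    return (1.0 - np.cos(t * np.pi)) / 2.0

def positional_encoding(x, alpha, num_freqs=10):
    """Weighted Fourier features of x (shape [..., D])."""
    w = band_weights(alpha, num_freqs)            # [L]
    freqs = 2.0 ** np.arange(num_freqs)           # [L]
    ang = x[..., None, :] * freqs[:, None]        # [..., L, D]
    feats = np.concatenate([np.sin(ang), np.cos(ang)], axis=-1)
    return (w[:, None] * feats).reshape(*x.shape[:-1], -1)
```

Early in training only the low-frequency bands are active, which keeps the synthesis objective smooth enough for pose registration; the full encoding is restored once the poses have roughly aligned.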

Language: English

Citations

409

Dream to Control: Learning Behaviors by Latent Imagination
Danijar Hafner, Timothy Lillicrap, Jimmy Ba

et al.

arXiv (Cornell University), Journal Year: 2019, Volume and Issue: unknown

Published: Jan. 1, 2019

Learned world models summarize an agent's experience to facilitate learning complex behaviors. While learning world models from high-dimensional sensory inputs is becoming feasible through deep learning, there are many potential ways for deriving behaviors from them. We present Dreamer, a reinforcement learning agent that solves long-horizon tasks from images purely by latent imagination. We efficiently learn behaviors by propagating analytic gradients of learned state values back through trajectories imagined in the compact state space of a learned world model. On 20 challenging visual control tasks, Dreamer exceeds existing approaches in data-efficiency, computation time, and final performance.
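The behaviour-learning loop can be sketched compactly. The following PyTorch sketch uses assumed interfaces (world_model.dynamics, world_model.reward, actor, critic are placeholders); Dreamer proper uses λ-returns and also trains the value model, both omitted here.

```python
import torch

def imagine_and_update(world_model, actor, critic, start_state, actor_opt,
                       horizon=15, gamma=0.99):
    """Roll the learned dynamics forward in latent space and backpropagate
    analytic gradients of the imagined return into the action model."""
    state, ret, discount = start_state, 0.0, 1.0
    for _ in range(horizon):
        action = actor(state)                        # reparameterized sample
        state = world_model.dynamics(state, action)  # differentiable transition
        ret = ret + discount * world_model.reward(state)
        discount *= gamma
    ret = ret + discount * critic(state)             # bootstrap with value net
    actor_loss = -ret.mean()                         # maximize imagined return
    actor_opt.zero_grad()
    actor_loss.backward()      # gradients flow through the imagined trajectory
    actor_opt.step()
    return float(actor_loss.detach())
```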

Language: English

Citations

395

State of the Art on Neural Rendering
Ayush Tewari, Ohad Fried, Justus Thies

et al.

Computer Graphics Forum, Journal Year: 2020, Volume and Issue: 39(2), P. 701 - 727

Published: May 1, 2020

Efficient rendering of photo-realistic virtual worlds is a long-standing effort of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer graphics more widely accessible. Concurrently, progress in computer vision and machine learning has given rise to a new approach to image synthesis and editing, namely deep generative models. Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering into network training. With a plethora of applications in computer graphics and vision, neural rendering is poised to become a new area in the graphics community, yet no survey of this emerging field exists. This state-of-the-art report summarizes recent trends and applications of neural rendering. We focus on approaches that combine classic computer graphics techniques with deep generative models to obtain controllable and photorealistic outputs. Starting with an overview of the underlying computer graphics and machine learning concepts, we discuss critical aspects of neural rendering approaches. Specifically, our emphasis is on the type of control, i.e., how the control is provided, which parts of the pipeline are learned, explicit vs. implicit control, generalization, and stochastic vs. deterministic synthesis. The second half of the report is focused on the many important use cases for the described algorithms such as novel view synthesis, semantic photo manipulation, facial and body reenactment, relighting, free-viewpoint video, and the creation of photo-realistic avatars for virtual and augmented reality telepresence. Finally, we conclude with a discussion of the social implications of such technology and investigate open research problems.

Language: English

Citations

368

SynSin: End-to-End View Synthesis From a Single Image
Olivia Wiles, Georgia Gkioxari, Richard Szeliski

et al.

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal Year: 2020, Volume and Issue: unknown

Published: June 1, 2020

View synthesis allows for the generation of new views of a scene given one or more images. This is challenging; it requires comprehensively understanding the 3D scene from images. As a result, current methods typically use multiple images, train on ground-truth depth, or are limited to synthetic data. We propose a novel end-to-end model for this task using a single image at test time; it is trained on real images without any ground-truth 3D information. To this end, we introduce a novel differentiable point cloud renderer that is used to transform a latent 3D point cloud of features into the target view. The projected features are decoded by our refinement network to inpaint missing regions and generate a realistic output image. The 3D component inside of the generative model allows for interpretable manipulation of the latent feature space at test time, e.g. we can animate trajectories from a single image. Additionally, we can generate high resolution images and generalise to other input resolutions. We outperform baselines and prior work on the Matterport, Replica, and RealEstate10K datasets.
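The differentiable point cloud renderer is the key component. Its projection step is standard pinhole geometry; the PyTorch sketch below (an illustrative helper, not SynSin's actual API) shows that step. The paper's renderer additionally splats each projected point softly over a neighbourhood of pixels so that gradients reach the point positions and features.

```python
import torch

def project_points(points, K, T):
    """Project a latent 3D point cloud into the target camera.
    points: [N, 3] world coordinates; K: [3, 3] intrinsics;
    T: [4, 4] world-to-camera extrinsics of the target view."""
    cam = T[:3, :3] @ points.T + T[:3, 3:4]   # world -> camera, [3, N]
    uv = K @ cam                              # homogeneous pixels, [3, N]
    return (uv[:2] / uv[2:]).T, cam[2]        # [N, 2] pixel coords, [N] depths
```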

Language: English

Citations

365

If deep learning is the answer, what is the question?
Andrew Saxe, Stephanie Nelli, Christopher Summerfield

et al.

Nature Reviews Neuroscience, Journal Year: 2020, Volume and Issue: 22(1), P. 55 - 67

Published: Nov. 16, 2020

Language: English

Citations

343

Pushing the Boundaries of View Extrapolation With Multiplane Images
Pratul P. Srinivasan, Richard Tucker, Jonathan T. Barron

et al.

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal Year: 2019, Volume and Issue: unknown

Published: June 1, 2019

We explore the problem of view synthesis from a narrow baseline pair of images, and focus on generating high-quality view extrapolations with plausible disocclusions. Our method builds upon prior work in predicting a multiplane image (MPI), which represents scene content as a set of RGBA planes within a reference view frustum and renders novel views by projecting this content into the target viewpoints. We present a theoretical analysis showing how the range of views that can be rendered from an MPI increases linearly with the MPI disparity sampling frequency, as well as a novel MPI prediction procedure that theoretically enables view extrapolations of up to 4 times the lateral viewpoint movement allowed by prior work. Our method ameliorates two specific issues that limit the range of views renderable by prior methods: 1) we expand the range of views that can be rendered without depth discretization artifacts by using a 3D convolutional network architecture along with a randomized-resolution training procedure to allow our model to predict MPIs with increased disparity sampling frequency; 2) we reduce the repeated texture artifacts seen in disocclusions by enforcing a constraint that the appearance of hidden content at any depth must be drawn from visible content at or behind that depth.
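Rendering from an MPI reduces to warping each RGBA plane into the target view and alpha-compositing the planes back to front. A minimal NumPy sketch of the compositing step (homography warping omitted; the array layout is an assumption):

```python
import numpy as np

def composite_mpi(rgba_planes):
    """Alpha-composite MPI planes with the standard "over" operator.
    rgba_planes: [D, H, W, 4], ordered from far to near, values in [0, 1]."""
    out = np.zeros(rgba_planes.shape[1:3] + (3,))
    for plane in rgba_planes:                # far -> near
        rgb, a = plane[..., :3], plane[..., 3:4]
        out = rgb * a + out * (1.0 - a)
    return out
```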

Language: English

Citations

338

Shared Neural Mechanisms of Visual Perception and Imagery
Nadine Dijkstra, Sander Bosch, Marcel van Gerven

et al.

Trends in Cognitive Sciences, Journal Year: 2019, Volume and Issue: 23(5), P. 423 - 434

Published: March 12, 2019

Language: English

Citations

317

Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video

Edgar Tretschk, Ayush Tewari, Vladislav Golyanik

et al.

2021 IEEE/CVF International Conference on Computer Vision (ICCV), Journal Year: 2021, Volume and Issue: unknown, P. 12939 - 12950

Published: Oct. 1, 2021

We present Non-Rigid Neural Radiance Fields (NR-NeRF), a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes. Our approach takes RGB images of a dynamic scene as input (e.g., from a monocular video recording), and creates a high-quality space-time geometry and appearance representation. We show that a single handheld consumer-grade camera is sufficient to synthesize sophisticated renderings of a dynamic scene from novel virtual camera views, e.g. a 'bullet-time' video effect. NR-NeRF disentangles the dynamic scene into a canonical volume and its deformation. Scene deformation is implemented as ray bending, where straight rays are deformed non-rigidly. We also propose a novel rigidity network to better constrain rigid regions of the scene, leading to more stable results. The ray bending and rigidity network are trained without explicit supervision. Our formulation enables dense correspondence estimation across views and time, as well as compelling video editing applications such as motion exaggeration. The code will be open sourced.
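The ray-bending idea can be sketched in a few lines: an MLP, conditioned on a per-frame latent code, predicts an offset that deforms each sample point into the canonical volume, where the static NeRF is queried. A PyTorch sketch with assumed sizes and names:

```python
import torch
import torch.nn as nn

class RayBending(nn.Module):
    """Per-frame deformation: bend sample points into the canonical volume."""
    def __init__(self, latent_dim=32, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, x, frame_code):
        offset = self.mlp(torch.cat([x, frame_code], dim=-1))
        return x + offset    # bent point, queried in the canonical NeRF
```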

Language: English

Citations

314

Space-time Neural Irradiance Fields for Free-Viewpoint Video
Wenqi Xian, Jia‐Bin Huang, Johannes Kopf

et al.

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal Year: 2021, Volume and Issue: unknown, P. 9416 - 9426

Published: June 1, 2021

We present a method that learns a spatiotemporal neural irradiance field for dynamic scenes from a single video. Our learned representation enables free-viewpoint rendering of the input video and builds upon recent advances in implicit representations. Learning such a field from a single video poses significant challenges because the video contains only one observation of the scene at any point in time. The 3D geometry of a scene can be legitimately represented in numerous ways since varying geometry (motion) can be explained with varying appearance and vice versa. We address this ambiguity by constraining the time-varying geometry of our representation using depth estimated from video depth estimation methods, aggregating contents from individual frames into a single global representation. We provide an extensive quantitative evaluation and demonstrate compelling results.
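The depth constraint mentioned above amounts to an extra loss term comparing the depth rendered from the space-time field against the per-frame monocular estimate. A minimal PyTorch sketch (names, weighting, and the omitted scale/shift alignment are assumptions):

```python
import torch

def depth_consistency_loss(rendered_depth, estimated_depth, weight=0.1):
    """Penalize disagreement between rendered depth and the depth predicted
    by an off-the-shelf video depth estimator for the same frame."""
    return weight * torch.mean(torch.abs(rendered_depth - estimated_depth))
```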

Language: English

Citations

287