HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling DOI
Benjamin Attal, Jia‐Bin Huang, Christian Richardt

et al.

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal year: 2023, Issue: 25, pp. 16610-16620

Published: June 1, 2023

Volumetric scene representations enable photorealistic view synthesis for static scenes and form the basis of several existing 6-DoF video techniques. However, the volume rendering procedures that drive these representations necessitate careful trade-offs in terms of quality, rendering speed, and memory efficiency. In particular, existing methods fail to simultaneously achieve real-time performance, a small memory footprint, and high-quality rendering for challenging real-world scenes. To address these issues, we present HyperReel, a novel 6-DoF video representation. The two core components of HyperReel are: (1) a ray-conditioned sample prediction network that enables high-fidelity, high-frame-rate rendering at high resolutions and (2) a compact and memory-efficient dynamic volume representation. Our 6-DoF video pipeline achieves the best performance compared to prior and contemporary approaches in terms of visual quality with small memory requirements, while also rendering at up to 18 frames per second at megapixel resolution without any custom CUDA code.
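The key idea of the sample prediction network can be illustrated with a toy sketch (this is a hypothetical illustration, not the authors' code): instead of densely sampling a volume along every ray, a small network maps each ray (origin and direction) to a handful of sample distances, so far fewer points need to be evaluated. The network weights and ranges below are arbitrary placeholders.

```python
# Hypothetical sketch of ray-conditioned sampling (not the HyperReel implementation):
# a tiny MLP maps each ray (origin o, direction d) to a few sorted sample
# distances along that ray, replacing dense stratified sampling.
import numpy as np

rng = np.random.default_rng(0)

def ray_sample_network(rays_o, rays_d, n_samples=8, hidden=32):
    """Toy MLP: 6-D ray input -> n_samples distances in (near, far)."""
    x = np.concatenate([rays_o, rays_d], axis=-1)        # (R, 6)
    W1 = rng.standard_normal((6, hidden)) * 0.1          # placeholder weights
    W2 = rng.standard_normal((hidden, n_samples)) * 0.1
    h = np.tanh(x @ W1)                                  # (R, hidden)
    logits = h @ W2                                      # (R, n_samples)
    near, far = 0.1, 4.0
    # squash into (near, far) and sort so samples advance along the ray
    t = near + (far - near) / (1.0 + np.exp(-logits))
    return np.sort(t, axis=-1)

rays_o = np.zeros((2, 3))                     # two rays from the origin
rays_d = np.array([[0, 0, 1.0], [0, 1.0, 0]])
t = ray_sample_network(rays_o, rays_d)
# sample points in space: o + t * d, shape (rays, samples, 3)
points = rays_o[:, None, :] + t[..., None] * rays_d[:, None, :]
```

In the actual method the predicted samples feed a volumetric model; the sketch only shows the ray-to-samples mapping that makes sparse, per-ray sampling possible.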

Language: English

Cited

55

SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields DOI

Ashkan Mirzaei, Tristan Aumentado-Armstrong, Konstantinos G. Derpanis

et al.

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal year: 2023, Issue: unknown

Published: June 1, 2023

Neural Radiance Fields (NeRFs) have emerged as a popular approach for novel view synthesis. While NeRFs are quickly being adapted for a wider set of applications, intuitively editing NeRF scenes is still an open challenge. One important editing task is the removal of unwanted objects from a 3D scene, such that the replaced region is visually plausible and consistent with its context. We refer to this task as 3D inpainting. In 3D, solutions must be both consistent across multiple views and geometrically valid. In this paper, we propose a novel 3D inpainting method that addresses these challenges. Given a small set of posed images and sparse annotations in a single input image, our framework first rapidly obtains a 3D segmentation mask for a target object. Using the mask, a perceptual optimization-based approach is then introduced that leverages learned 2D image inpainters, distilling their information into 3D space, while ensuring view consistency. We also address the lack of a diverse benchmark for evaluating 3D scene inpainting methods by introducing a dataset comprised of challenging real-world scenes. In particular, our dataset contains views of the same scene with and without the target object, enabling more principled benchmarking of the 3D inpainting task. We demonstrate the superiority of our approach on multiview segmentation, comparing to NeRF-based approaches. We then evaluate on 3D inpainting, establishing state-of-the-art performance against other NeRF manipulation algorithms, as well as a strong 2D inpainter baseline.
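The distillation idea above can be sketched with a toy masked objective (a hypothetical illustration, not the authors' code): outside the object mask, a rendered view is supervised by the captured image; inside the mask, it is supervised by a 2D inpainter's output instead. The pixel-wise inside term stands in for the perceptual loss the paper uses; all names and weights below are placeholders.

```python
# Hypothetical sketch of a masked inpainting-distillation objective
# (not the SPIn-NeRF implementation): outside the mask, match the capture;
# inside the mask, match the 2D inpainter's prediction.
import numpy as np

def inpainting_loss(rendered, captured, inpainted, mask, w_inpaint=0.5):
    """mask == 1 marks the removed-object region in one training view."""
    outside = (1 - mask) * (rendered - captured) ** 2
    inside = mask * (rendered - inpainted) ** 2   # stand-in for a perceptual term
    return outside.mean() + w_inpaint * inside.mean()

H = W = 4
captured = np.ones((H, W))
inpainted = np.zeros((H, W))
mask = np.zeros((H, W)); mask[1:3, 1:3] = 1.0
# a render that agrees with the capture outside the mask and the inpainter inside it
rendered = np.where(mask > 0, inpainted, captured)
loss = inpainting_loss(rendered, captured, inpainted, mask)
```

Summing this loss over many posed views is what forces the inpainted region to stay consistent in 3D rather than per image.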

Language: English

Cited

64

ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields DOI
Mohammad Mahdi Johari, Camilla Carta, François Fleuret

et al.

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal year: 2023, Issue: unknown, pp. 17408-17419

Published: June 1, 2023

We present ESLAM, an efficient implicit neural representation method for Simultaneous Localization and Mapping (SLAM). ESLAM reads RGB-D frames with unknown camera poses in a sequential manner and incrementally reconstructs the scene representation while estimating the current camera position in the scene. We incorporate the latest advances in Neural Radiance Fields (NeRF) into a SLAM system, resulting in an efficient and accurate dense visual SLAM method. Our scene representation consists of multi-scale axis-aligned perpendicular feature planes and shallow decoders that, for each point in the continuous space, decode the interpolated features into Truncated Signed Distance Field (TSDF) and RGB values. Our extensive experiments on three standard datasets, Replica, ScanNet, and TUM RGB-D, show that ESLAM improves the accuracy of 3D reconstruction and camera localization of state-of-the-art dense visual SLAM methods by more than 50%, while it runs up to 10× faster and does not require any pre-training. Project page: https://www.idiap.ch/paper/eslam.
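The feature-plane representation described above can be sketched as follows (a hypothetical toy, not the ESLAM code): a 3D point is projected onto the xy, xz, and yz feature planes; bilinearly interpolated features are summed and decoded into TSDF and RGB values by shallow decoders. Resolutions, feature widths, and weights below are arbitrary placeholders.

```python
# Hypothetical sketch of axis-aligned feature-plane decoding
# (not the ESLAM implementation): three 2D feature planes replace a dense
# 3D grid; a point's features come from bilinear lookups on its projections.
import numpy as np

rng = np.random.default_rng(0)
RES, F = 16, 8                                  # plane resolution, feature width
planes = {ax: rng.standard_normal((RES, RES, F)) * 0.1 for ax in ("xy", "xz", "yz")}

def bilerp(plane, u, v):
    """Bilinear interpolation on a plane, with u, v in [0, 1]."""
    u, v = u * (RES - 1), v * (RES - 1)
    i, j = int(u), int(v)
    i1, j1 = min(i + 1, RES - 1), min(j + 1, RES - 1)
    du, dv = u - i, v - j
    return ((1 - du) * (1 - dv) * plane[i, j] + du * (1 - dv) * plane[i1, j]
            + (1 - du) * dv * plane[i, j1] + du * dv * plane[i1, j1])

W_sdf = rng.standard_normal((F, 1)) * 0.1       # stand-ins for the shallow decoders
W_rgb = rng.standard_normal((F, 3)) * 0.1

def decode(x, y, z):
    """Point in [0, 1]^3 -> (tsdf, rgb)."""
    feat = (bilerp(planes["xy"], x, y) + bilerp(planes["xz"], x, z)
            + bilerp(planes["yz"], y, z))
    tsdf = np.tanh(feat @ W_sdf)[0]             # truncated signed distance in (-1, 1)
    rgb = 1.0 / (1.0 + np.exp(-(feat @ W_rgb))) # colors in (0, 1)
    return tsdf, rgb

tsdf, rgb = decode(0.3, 0.5, 0.7)
```

Storing three 2D planes instead of one dense 3D grid is what keeps the memory footprint roughly quadratic rather than cubic in resolution.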

Language: English

Cited

63

Panoptic Lifting for 3D Scene Understanding with Neural Fields DOI

Yawar Siddiqui, Lorenzo Porzi, Samuel Rota Bulò

et al.

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal year: 2023, Issue: unknown

Published: June 1, 2023

We propose Panoptic Lifting, a novel approach for learning panoptic 3D volumetric representations from images of in-the-wild scenes. Once trained, our model can render color images together with 3D-consistent panoptic segmentation from novel viewpoints. Unlike existing approaches which use 3D input directly or indirectly, our method requires only machine-generated 2D panoptic segmentation masks inferred from a pre-trained network. Our core contribution is a panoptic lifting scheme based on a neural field representation that generates a unified and multi-view consistent 3D panoptic representation of the scene. To account for inconsistencies of 2D instance identifiers across views, we solve a linear assignment with a cost based on the model's current predictions and the machine-generated segmentation masks, thus enabling us to lift 2D instances to 3D in a consistent way. We further ablate contributions that make our method more robust to noisy, machine-generated labels, including test-time augmentations for confidence estimates, a segment consistency loss, bounded segmentation fields, and gradient stopping. Experimental results validate our approach on the challenging Hypersim, Replica, and ScanNet datasets, improving by 8.4, 13.8, and 10.6% in scene-level PQ over the state of the art.
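The linear-assignment step can be illustrated with a toy sketch (a hypothetical illustration, not the authors' code): each view's machine-generated instance IDs are arbitrary, so they are matched to the model's 3D instances by maximizing total overlap. For small instance counts this can be solved by brute force over permutations; a real implementation would use a Hungarian-algorithm solver instead.

```python
# Hypothetical sketch of lifting per-view 2D instance IDs into a consistent
# 3D labeling (not the Panoptic Lifting implementation): match each view's
# masks to 3D instances via a linear assignment that maximizes overlap.
from itertools import permutations
import numpy as np

def match_instances(overlap):
    """overlap[i, j] = e.g. IoU between 3D instance i and 2D mask j in one view.
    Returns, for each 2D mask j, the 3D instance it is assigned to."""
    n = overlap.shape[0]
    # brute-force linear assignment; fine for a handful of instances
    best = max(permutations(range(n)),
               key=lambda p: sum(overlap[i, p[i]] for i in range(n)))
    # best[i] is the 2D mask matched to 3D instance i; invert to mask -> instance
    assignment = {best[i]: i for i in range(n)}
    return [assignment[j] for j in range(n)]

# 2D masks 0/1/2 in this view actually correspond to 3D instances 2/0/1
overlap = np.array([[0.1, 0.9, 0.0],
                    [0.0, 0.1, 0.8],
                    [0.7, 0.2, 0.1]])
mapping = match_instances(overlap)  # relabels this view's IDs consistently
```

Re-solving this assignment per view as training progresses is what lets inconsistent 2D labelings supervise a single coherent set of 3D instances.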

Language: English

Cited

60

OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation DOI
Tong Wu, Jiarui Zhang, Xiao Fu

et al.

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal year: 2023, Issue: unknown, pp. 803-814

Published: June 1, 2023

Recent advances in modeling 3D objects mostly rely on synthetic datasets due to the lack of large-scale, real-scanned 3D databases. To facilitate the development of 3D perception, reconstruction, and generation in the real world, we propose OmniObject3D, a large-vocabulary 3D object dataset with massive high-quality, real-scanned 3D objects. OmniObject3D has several appealing properties: 1) Large Vocabulary: It comprises 6,000 scanned objects in 190 daily categories, sharing common classes with popular 2D datasets (e.g., ImageNet and LVIS), benefiting the pursuit of generalizable 3D representations. 2) Rich Annotations: Each 3D object is captured with both 2D and 3D sensors, providing textured meshes, point clouds, multi-view rendered images, and multiple real-captured videos. 3) Realistic Scans: The professional scanners support high-quality object scans with precise shapes and realistic appearances. With the vast exploration space offered by OmniObject3D, we carefully set up four evaluation tracks: a) robust 3D perception, b) novel-view synthesis, c) neural surface reconstruction, and d) 3D object generation. Extensive studies are performed on these benchmarks, revealing new observations, challenges, and opportunities for future research in realistic 3D vision.

Language: English

Cited

58
