SiLK: Simple Learned Keypoints

Pierre Gleize, Weiyao Wang, Matt Feiszli

et al.

2023 IEEE/CVF International Conference on Computer Vision (ICCV), Journal Year: 2023, Volume and Issue: unknown, P. 22442 - 22451

Published: Oct. 1, 2023

Keypoint detection & descriptors are foundational technologies for computer vision tasks like image matching, 3D reconstruction and visual odometry. Hand-engineered methods like Harris corners, SIFT, and HOG have been used for decades; more recently, there has been a trend to introduce learning in an attempt to improve keypoint detectors. On inspection however, the results are difficult to interpret; recent learning-based methods employ a vast diversity of experimental setups and design choices: empirical results are often reported using different backbones, protocols, datasets, types of supervisions, or tasks. Since these differences are often coupled together, it raises a natural question on what makes a good learned keypoint detector. In this work, we revisit existing detectors by deconstructing their methodologies and identifying key components. We re-design each component from first principles and propose Simple Learned Keypoints (SiLK) that is fully-differentiable, lightweight, and flexible. Despite its simplicity, SiLK advances the new state-of-the-art on Detection Repeatability and Homography Estimation on HPatches and on the Point-Cloud Registration task on ScanNet, and achieves competitive performance on camera pose estimation in the 2022 Image Matching Challenge and ScanNet.
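
As a rough illustration of what a fully-differentiable detector and descriptor pipeline looks like, here is a minimal PyTorch sketch: a tiny shared backbone (hypothetical layer sizes) emits dense descriptors and keypoint logits, and a double-softmax over descriptor similarities scores soft mutual-nearest-neighbor matches. This is a sketch of the general technique, not the authors' released SiLK code.

```python
# Minimal sketch of a SiLK-style pipeline: shared backbone, dense descriptor
# and keypoint heads, double-softmax matching. Layer sizes are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleKeypointNet(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
        )
        self.desc_head = nn.Conv2d(128, dim, 1)  # dense descriptors
        self.kpt_head = nn.Conv2d(128, 1, 1)     # per-pixel keypoint logits

    def forward(self, img):
        feat = self.backbone(img)
        desc = F.normalize(self.desc_head(feat), dim=1)  # unit-norm descriptors
        return desc, self.kpt_head(feat)

def double_softmax_matches(desc0, desc1, temperature=0.1):
    """Soft mutual-nearest-neighbor match probabilities between descriptor
    sets of shape (N, D) and (M, D)."""
    sim = desc0 @ desc1.t() / temperature
    p01 = sim.softmax(dim=1)  # image 0 -> image 1
    p10 = sim.softmax(dim=0)  # image 1 -> image 0
    return p01 * p10          # high only where both directions agree

# Toy usage: score correspondences between two random grayscale images.
net = SimpleKeypointNet()
d0, _ = net(torch.rand(1, 1, 64, 64))
d1, _ = net(torch.rand(1, 1, 64, 64))
d0 = d0.flatten(2).squeeze(0).t()  # (H*W, D)
d1 = d1.flatten(2).squeeze(0).t()
probs = double_softmax_matches(d0, d1)
```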

Language: English

Recent advances and clinical applications of deep learning in medical image analysis
Xuxin Chen, Ximin Wang, Ke Zhang

et al.

Medical Image Analysis, Journal Year: 2022, Volume and Issue: 79, P. 102444 - 102444

Published: April 4, 2022

Language: English

Citations

572

Image fusion meets deep learning: A survey and perspective
Hao Zhang, Han Xu, Xin Tian

et al.

Information Fusion, Journal Year: 2021, Volume and Issue: 76, P. 323 - 336

Published: July 6, 2021

Language: English

Citations

501

SDNet: A Versatile Squeeze-and-Decomposition Network for Real-Time Image Fusion
Hao Zhang, Jiayi Ma

International Journal of Computer Vision, Journal Year: 2021, Volume and Issue: 129(10), P. 2761 - 2785

Published: July 30, 2021

Language: English

Citations

362

A review of multimodal image matching: Methods and applications
Xingyu Jiang, Jiayi Ma, Guobao Xiao

et al.

Information Fusion, Journal Year: 2021, Volume and Issue: 73, P. 22 - 71

Published: March 1, 2021

Language: English

Citations

345

A review on deep learning in UAV remote sensing
Lucas Prado Osco, José Marcato, Ana Paula Marques Ramos

et al.

International Journal of Applied Earth Observation and Geoinformation, Journal Year: 2021, Volume and Issue: 102, P. 102456 - 102456

Published: July 27, 2021

Deep Neural Networks (DNNs) learn representations from data with an impressive capability, and have brought important breakthroughs for processing images, time-series, natural language, audio, video, and many others. In the remote sensing field, surveys and literature revisions specifically involving applications of DNN algorithms have been conducted in an attempt to summarize the amount of information produced in its subfields. Recently, Unmanned Aerial Vehicle (UAV)-based applications have dominated aerial sensing research. However, a literature revision that combines both the "deep learning" and "UAV remote sensing" thematics has not yet been conducted. The motivation for our work was to present a comprehensive review of the fundamentals of Deep Learning (DL) applied to UAV-based imagery. We focused mainly on describing the classification and regression techniques used in recent applications with UAV-acquired data. For that, a total of 232 papers published in international scientific journal databases were examined. We gathered the published material and evaluated their characteristics regarding the application, sensor, and technique used. We discuss how DL presents promising results and potential for tasks associated with UAV-based image processing. Lastly, we project future perspectives, commentating on prominent DL paths to be explored in the UAV remote sensing field. This revision consists of a friendly approach to introduce, commentate, and summarize the state-of-the-art of DNN algorithms in diverse subfields of remote sensing, grouping it in the environmental, urban, and agricultural contexts.

Language: English

Citations

340

A Multiscale Framework With Unsupervised Learning for Remote Sensing Image Registration
Yuanxin Ye, Tengfeng Tang, Bai Zhu

et al.

IEEE Transactions on Geoscience and Remote Sensing, Journal Year: 2022, Volume and Issue: 60, P. 1 - 15

Published: Jan. 1, 2022

Registration for multisensor or multimodal image pairs with a large degree of distortion is a fundamental task for many remote sensing applications. To achieve accurate and low-cost registration, we propose a multiscale framework with unsupervised learning, named MU-Net. Without costly ground truth labels, MU-Net directly learns the end-to-end mapping from image pairs to their transformation parameters. MU-Net stacks several deep neural network (DNN) models on multiple scales to generate a coarse-to-fine registration pipeline, which prevents the backpropagation from falling into a local extremum and resists significant image distortions. We design a novel loss function paradigm based on structural similarity, which makes MU-Net suitable for various types of multimodal images. MU-Net is compared with traditional feature-based and area-based methods, as well as with supervised and other unsupervised learning methods, on optical-optical, optical-infrared, optical-synthetic aperture radar (SAR), and optical-map datasets. Experimental results show that MU-Net achieves more comprehensive and accurate registration performance between these image pairs with large geometric and radiometric distortions. We share the code implemented in PyTorch at https://github.com/yeyuanxin110/MU-Net .
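
The unsupervised recipe above (regress transformation parameters, warp differentiably, supervise with a structural-similarity loss) can be sketched compactly. The PyTorch sketch below assumes a toy single-scale affine regressor and a 3x3-window SSIM; the paper's multiscale DNN stack and loss paradigm are more elaborate.

```python
# Unsupervised registration sketch: no labels, gradients flow from an
# SSIM loss through a differentiable warp into the parameter regressor.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 6),
        )
        # Initialize the final layer to the identity transform for stability.
        self.net[-1].weight.data.zero_()
        self.net[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, fixed, moving):
        theta = self.net(torch.cat([fixed, moving], dim=1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, moving.shape, align_corners=False)
        return F.grid_sample(moving, grid, align_corners=False)

def ssim_loss(x, y, c1=0.01**2, c2=0.03**2):
    """1 - mean local SSIM, with 3x3 average-pooling windows."""
    mu_x, mu_y = F.avg_pool2d(x, 3, 1, 1), F.avg_pool2d(y, 3, 1, 1)
    var_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
    cov = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return (1 - ssim).mean()

model = AffineRegressor()
fixed, moving = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
loss = ssim_loss(fixed, model(fixed, moving))
loss.backward()  # end-to-end gradients, no ground-truth labels required
```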

Language: English

Citations

114

R2FD2: Fast and Robust Matching of Multimodal Remote Sensing Images via Repeatable Feature Detector and Rotation-Invariant Feature Descriptor
Bai Zhu, Chao Yang, Jinkun Dai

et al.

IEEE Transactions on Geoscience and Remote Sensing, Journal Year: 2023, Volume and Issue: 61, P. 1 - 15

Published: Jan. 1, 2023

Identifying feature correspondences between multimodal images faces enormous challenges because of the significant differences in both radiation and geometry. To address these problems, we propose a novel feature matching method (named R2FD2) that is robust to radiation and rotation differences, and which consists of a repeatable feature detector and a rotation-invariant feature descriptor. In the first stage, a repeatable feature detector called the Multi-channel Auto-correlation of the Log-Gabor (MALG) is presented for feature detection, which combines the multi-channel auto-correlation strategy with Log-Gabor wavelets to detect interest points (IPs) with high repeatability and uniform distribution. In the second stage, a rotation-invariant feature descriptor is constructed, named the Rotation-invariant Maximum index map of the Log-Gabor (RMLG), which includes fast assignment of the dominant orientation and construction of the feature representation. In the process of orientation assignment, a Rotation-invariant Maximum Index Map (RMIM) is built to address rotation deformations. Then, the proposed RMLG incorporates the RMIM with the spatial configuration of DAISY to improve RMLG's resistance to radiation and rotation variances. Finally, we conduct experiments to validate the performance of R2FD2 utilizing different types of multimodal image datasets. Experimental results show that R2FD2 outperforms five state-of-the-art feature matching methods. Moreover, R2FD2 achieves matching accuracy within two pixels and has a great advantage in efficiency over the contrastive methods.
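
To make the "maximum index map" idea concrete, the NumPy sketch below records, per pixel, which orientation channel responds most strongly. Plain gradient projections stand in for the paper's Log-Gabor wavelet bank, so this is a dependency-free simplification of the RMIM concept, not the R2FD2 algorithm itself.

```python
# Simplified maximum index map: per pixel, the index of the strongest
# orientation response. Gradient projections approximate oriented filters.
import numpy as np

def maximum_index_map(img, n_orientations=6):
    """Return, per pixel, the winning orientation index in [0, n_orientations)."""
    gy, gx = np.gradient(img.astype(np.float64))
    responses = []
    for k in range(n_orientations):
        theta = k * np.pi / n_orientations
        # Oriented response approximated by projecting the gradient
        # onto direction theta.
        responses.append(np.abs(gx * np.cos(theta) + gy * np.sin(theta)))
    return np.argmax(np.stack(responses, axis=0), axis=0)

# Because the map stores *which* channel wins rather than raw intensities,
# it is more stable across radiometric differences between modalities;
# a cyclic shift of the indices models in-plane rotation.
img = np.random.rand(64, 64)
rmim = maximum_index_map(img)
print(rmim.shape, rmim.min(), rmim.max())
```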

Language: English

Citations

78

MURF: Mutually Reinforcing Multi-Modal Image Registration and Fusion
Han Xu, Jiteng Yuan, Jiayi Ma

et al.

IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal Year: 2023, Volume and Issue: 45(10), P. 12148 - 12166

Published: June 7, 2023

Existing image fusion methods are typically limited to aligned source images and have to "tolerate" parallaxes when the images are unaligned. Simultaneously, the large variances between different modalities pose a significant challenge for multi-modal image registration. This study proposes a novel method called MURF, where, for the first time, registration and fusion are mutually reinforced rather than being treated as separate issues. MURF leverages three modules: the shared information extraction module (SIEM), the multi-scale coarse registration module (MCRM), and the fine registration and fusion module (F2M). Registration is carried out in a coarse-to-fine manner. During coarse registration, the SIEM first transforms multi-modal images into mono-modal shared information to eliminate the modal variances. Then, the MCRM progressively corrects the global rigid parallaxes. Subsequently, fine registration to repair local non-rigid offsets and fusion are uniformly implemented in F2M. The fused image provides feedback to improve the registration accuracy, and the improved registration result further improves the fusion result. For image fusion, rather than solely preserving the original source information as in existing methods, we attempt to incorporate texture enhancement into fusion. We test MURF on four types of data (RGB-IR, RGB-NIR, PET-MRI, and CT-MRI). Extensive registration and fusion results validate the superiority and universality of MURF.
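
The coarse-to-fine composition described above can be outlined in a few lines. The PyTorch sketch below chains a rigid (affine) warp, a dense non-rigid offset warp, and a placeholder fusion step; all shapes and the weighted-average fusion are assumptions for illustration, whereas the paper's SIEM/MCRM/F2M modules are learned networks.

```python
# Schematic coarse-to-fine pipeline: rigid warp -> non-rigid warp -> fusion.
import torch
import torch.nn.functional as F

def rigid_warp(moving, theta):
    """Apply a 2x3 affine matrix (coarse registration stage)."""
    grid = F.affine_grid(theta, moving.shape, align_corners=False)
    return F.grid_sample(moving, grid, align_corners=False)

def nonrigid_warp(moving, flow):
    """Apply a dense per-pixel offset field (fine registration stage).
    flow: (N, 2, H, W) in normalized [-1, 1] coordinates."""
    n = moving.shape[0]
    base = F.affine_grid(torch.eye(2, 3).unsqueeze(0).repeat(n, 1, 1),
                         moving.shape, align_corners=False)
    return F.grid_sample(moving, base + flow.permute(0, 2, 3, 1),
                         align_corners=False)

def fuse(a, b, alpha=0.5):
    """Placeholder fusion: a weighted average; MURF instead learns a
    texture-enhancing fusion network."""
    return alpha * a + (1 - alpha) * b

fixed = torch.rand(1, 1, 64, 64)
moving = torch.rand(1, 1, 64, 64)
theta = torch.eye(2, 3).unsqueeze(0)   # identity rigid transform
flow = torch.zeros(1, 2, 64, 64)       # zero non-rigid offsets
fused = fuse(fixed, nonrigid_warp(rigid_warp(moving, theta), flow))
```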

Language: English

Citations

78

SACF-Net: Skip-Attention Based Correspondence Filtering Network for Point Cloud Registration
Yue Wu, Xidao Hu, Yue Zhang

et al.

IEEE Transactions on Circuits and Systems for Video Technology, Journal Year: 2023, Volume and Issue: 33(8), P. 3585 - 3595

Published: Jan. 16, 2023

Rigid registration is a transformation estimation problem between two point clouds. The point clouds captured may partially overlap owing to different viewpoints and acquisition times. Some previous correspondence matching based methods utilize an encoder-decoder network to carry out the partial-to-partial registration task and adopt a skip-connection structure to convey information between the encoder and decoder. However, equally revisiting encoder features through skip connections may introduce redundancy and limit the feature learning ability of the entire network. To address these problems, we propose a skip-attention based correspondence filtering network (SACF-Net) for point cloud registration. A novel feature interaction mechanism is designed for both low-level geometric and high-level context-aware features to enhance the original pointwise feature map. Additionally, the proposed method selectively revisits encoder features at different resolutions, allowing the decoder to extract high-quality correspondences within overlapping regions. We conduct comprehensive experiments on indoor and outdoor scene datasets, and the results show that SACF-Net yields unprecedented performance improvements.
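
A minimal PyTorch sketch of the skip-attention mechanism follows: instead of an unconditional skip connection, the decoder's features attend over the encoder's features and receive a filtered view of them. The dimensions and the points-as-tokens framing are illustrative assumptions, not SACF-Net's actual architecture.

```python
# Attention-filtered skip connection: the decoder selects which encoder
# features to revisit instead of copying them unconditionally.
import torch
import torch.nn as nn

class SkipAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)  # query from decoder features
        self.k = nn.Linear(dim, dim)  # key from encoder features
        self.v = nn.Linear(dim, dim)  # value from encoder features

    def forward(self, dec_feat, enc_feat):
        """dec_feat: (N, P, D) decoder tokens; enc_feat: (N, P, D) encoder
        tokens at the matching resolution (points treated as tokens)."""
        attn = torch.softmax(
            self.q(dec_feat) @ self.k(enc_feat).transpose(1, 2)
            / dec_feat.shape[-1] ** 0.5, dim=-1)
        # The decoder receives an attention-filtered view of the encoder.
        return dec_feat + attn @ self.v(enc_feat)

skip = SkipAttention(dim=32)
dec = torch.rand(2, 128, 32)  # 128 points, 32-dim features
enc = torch.rand(2, 128, 32)
out = skip(dec, enc)          # (2, 128, 32)
```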

Language: English

Citations

76

A Unified Transformer Framework for Group-Based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection
Yukun Su, Jingliang Deng, Ruizhou Sun

et al.

IEEE Transactions on Multimedia, Journal Year: 2023, Volume and Issue: 26, P. 313 - 325

Published: April 5, 2023

Humans tend to mine objects by learning from a group of images or several frames of video, since we live in a dynamic world. In the computer vision area, many researchers focus on co-segmentation (CoS), co-saliency detection (CoSD), and video salient object detection (VSOD) to discover co-occurrent objects. However, previous approaches design different networks for these similar tasks separately, and they are difficult to apply to each other. Besides, they fail to take full advantage of the cues among inter- and intra-features within a group of images. In this paper, we introduce a unified framework to tackle these issues in a single view, termed UFGS (Unified Framework for Group-based Segmentation). Specifically, we first introduce a transformer block, which views an image feature as a patch token and then captures their long-range dependencies through the self-attention mechanism. This can help the network excavate the patch-structured similarities among the relevant objects. Furthermore, we propose an intra-MLP learning module to produce a self-mask and enhance the network so as to avoid partial activation. Extensive experiments on four CoS benchmarks (PASCAL, iCoseg, Internet, and MSRC), three CoSD benchmarks (Cosal2015, CoSOD3k, and CoCA), and five VSOD benchmarks (DAVIS16, FBMS, ViSal, SegV2, and DAVSOD) show that our method outperforms other state-of-the-art methods in both accuracy and speed using the same network architecture, which can reach 140 FPS in real-time.
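
The group-attention idea (patch tokens from all images attending to one another) is easy to sketch. The PyTorch snippet below concatenates every image's patch tokens and runs joint self-attention, so a patch in one image can attend to similar patches elsewhere in the group; the single layer and token sizes are illustrative, while the paper stacks full transformer blocks.

```python
# Joint self-attention over a group of images' patch tokens, exposing
# co-occurrent objects across the group.
import torch
import torch.nn as nn

class GroupAttention(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, group_tokens):
        """group_tokens: (G, P, D) patch tokens for G images. Attention runs
        over all G*P tokens jointly, so a patch in one image can attend to
        similar patches in every other image of the group."""
        g, p, d = group_tokens.shape
        tokens = group_tokens.reshape(1, g * p, d)
        out, _ = self.attn(tokens, tokens, tokens)
        return self.norm(tokens + out).reshape(g, p, d)

layer = GroupAttention()
group = torch.rand(5, 196, 64)  # 5 images, 14x14 patch grid, 64-dim tokens
enhanced = layer(group)         # same shape, now group-aware features
```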

Language: English

Citations

73