MUTUAL RELATIVE POSITION LEARNING TRANSFORMER FOR CROSS-VIEW GEO-LOCALIZATION

Bo Gu, Hefei Ling, Yuxuan Shi, Zongyi Li, Chuang Zhao, Ping Li, Qiang Cao

Lecture 10 Oct 2023

Cross-view geo-localization refers to matching ground images with geo-tagged satellite imagery. Existing methods are mainly two-stage: they apply a polar transform to roughly reduce the gap between the two domains, but this can introduce distortions and make the features less discriminative. In this work, we propose a transformer-based one-stage approach that unifies gap elimination and feature extraction. The relative position among objects provides critical clues for this task and has strong spatial correspondences between the two views. First, we form the relative positions by selecting representative tokens from different regions. Then the relative positions of the two views predict each other, eliminating the gap through mutual learning. Finally, we introduce a novel consistency loss that enhances feature learning through the mutual transfer of relational knowledge among samples. Extensive experiments demonstrate that our method achieves state-of-the-art results on both standard and fine-grained datasets.
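The matching task described in the abstract is usually evaluated as retrieval: each ground-image embedding is compared against all satellite embeddings, and the nearest one is taken as the predicted location. The sketch below illustrates only that generic retrieval step with cosine similarity and recall@k. It is not the authors' model: the function names (`retrieve`, `recall_at_k`) and the toy embeddings are assumptions made for illustration, and in practice the embeddings would come from a trained network.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def retrieve(ground_emb, sat_emb, k=1):
    """Return, for each ground query, the indices of its k most similar satellite embeddings."""
    sim = l2_normalize(ground_emb) @ l2_normalize(sat_emb).T
    # Sort by descending similarity and keep the first k columns.
    return np.argsort(-sim, axis=1)[:, :k]


def recall_at_k(topk, gt_index):
    """Fraction of queries whose true satellite index appears among the top-k results."""
    hits = (topk == gt_index[:, None]).any(axis=1)
    return float(hits.mean())


# Toy data (hypothetical): four satellite embeddings and slightly shifted
# ground embeddings standing in for the output of a trained model.
sat = np.eye(4, 6)
ground = sat + 0.05
topk = retrieve(ground, sat, k=1)
r1 = recall_at_k(topk, np.arange(4))
```

Here `r1` is the recall@1 over the toy queries; with real data, `gt_index` would hold the index of the geo-tagged satellite image paired with each ground image.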

Tags: image retrieval, deep learning, geo-localization, transformer
