GLOBAL MATCHING-OPTIMIZATION NETWORK FOR STEREO DEPTH ESTIMATION
Yidi Zhang (Tsinghua University); Wenqi Huang (China Southern Power Grid); Wenming Yang (Tsinghua University)
SPS
Recently, iterative optimization-based approaches have made tremendous progress in stereo matching. However, accurately estimating disparity in occluded and textureless regions remains a challenge. To address this challenge, we present the Global Matching-Optimization Stereo Network (GMOStereo), which contains three components: a Conv-Trans Feature Extraction Module (C-TFEM), a Global Matching Module (GMM), and scene-aware disparity optimization. The C-TFEM extracts features through a two-branch network built on convolution blocks and transformer blocks, and is designed to obtain global feature representations while preserving fine-grained information. Before iterative optimization, the attention-based GMM builds stable interdependence across the two views. The scene self-similarity adopted in disparity optimization supplements the matching information. Finally, a Matching-Optimization loss is designed to guide training by imposing a direct constraint on the correlation volume. Evaluation demonstrates that GMOStereo achieves superior cross-dataset generalization performance and outperforms typical methods in the foreground and challenging regions on the KITTI-2015 benchmark.
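The abstract does not specify how the GMM or the correlation volume is computed. Purely as an illustration of the general idea of global matching over a correlation volume, the following NumPy sketch correlates each left-view feature with every right-view feature on the same scanline and takes a softmax-weighted expectation to obtain an initial disparity. The function names, the dot-product correlation, the scaling by the square root of the channel count, and the non-negative-disparity mask are all assumptions made for this sketch, not the authors' implementation.

```python
import numpy as np

def correlation_volume(feat_left, feat_right):
    """Hypothetical helper: dot-product correlation along each scanline.

    feat_left, feat_right: arrays of shape (H, W, C).
    Returns an array of shape (H, W, W), where entry [h, i, j] compares
    left pixel (h, i) with right pixel (h, j).
    """
    channels = feat_left.shape[-1]
    return np.einsum("hic,hjc->hij", feat_left, feat_right) / np.sqrt(channels)

def global_matching_disparity(feat_left, feat_right):
    """Hypothetical helper: soft-argmax disparity from the correlation volume."""
    _, width, _ = feat_left.shape
    corr = correlation_volume(feat_left, feat_right)

    # Assumption: rectified stereo, so a left pixel at column i can only
    # match a right pixel at column j <= i (disparity i - j >= 0).
    cols = np.arange(width)
    valid = cols[None, :] <= cols[:, None]           # (W, W), rows = i, cols = j
    corr = np.where(valid[None], corr, -np.inf)

    # Numerically stable softmax over all candidate right-view columns.
    corr = corr - corr.max(axis=-1, keepdims=True)
    prob = np.exp(corr)
    prob /= prob.sum(axis=-1, keepdims=True)

    # Expected matching column, then disparity = left column - matched column.
    expected_col = prob @ cols.astype(prob.dtype)    # (H, W)
    return cols[None, :] - expected_col
```

In an iterative pipeline of the kind the abstract describes, a disparity obtained this way would serve only as a starting point that later optimization steps refine; how GMOStereo actually does this is not stated in the abstract.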