Multi-Scale Deformable Transformer Encoder Based Single-Stage Pedestrian Detection
Jing Yuan, Panagiotis Barmpoutis, Tania Stathaki
SPS
Length: 00:07:06
Deep neural networks for stereo disparity estimation have recently surpassed the performance of traditional hand-crafted models. However, training these networks requires large labeled databases to obtain accurate disparity estimates. In this letter, we address the large data requirement by generating synthetic data using natural image statistics. Images generated using the dead leaves model have been shown to share many statistical characteristics commonly seen in natural images. In this work, we created a synthetic dataset using the 3D dead leaves model, consisting of occluding spheres, and projected the spheres onto parallel camera planes to obtain stereo image pairs along with ground-truth disparity maps. The generated data was subsequently used to train a deep neural network in a supervised manner to estimate disparity. Through experiments, we show that the trained model achieves competitive performance across real-world and synthetic stereo datasets, even without any additional fine-tuning. The proposed method for dataset generation is simple, computationally inexpensive, and easily scaled for large-scale data generation.
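The data generation idea described above (occluding spheres projected onto two parallel cameras, with disparity derived from depth) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `render_view` and `make_stereo_pair`, the uniform sphere size and position distributions, the flat gray shading, and all numeric parameters are assumptions chosen only for illustration. The sketch relies on the standard rectified-stereo relation, disparity = focal length × baseline / depth.

```python
import numpy as np


def render_view(centers, radii, colors, cam_x, f, height, width):
    """Ray-cast spheres from a pinhole camera at (cam_x, 0, 0) looking along +Z.

    Hypothetical helper: returns a grayscale image and a per-pixel depth map
    (np.inf where no sphere is hit).
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    # Un-normalised ray directions with d_z = 1, so the ray parameter t equals depth Z.
    d = np.stack([(u - cx) / f, (v - cy) / f, np.ones((height, width))], axis=-1)
    a = np.sum(d * d, axis=-1)
    depth = np.full((height, width), np.inf)
    image = np.full((height, width), 0.5)  # assumed mid-gray background
    origin = np.array([cam_x, 0.0, 0.0])
    for c, r, col in zip(centers, radii, colors):
        oc = c - origin
        b = d @ oc
        # Discriminant of |t*d - oc|^2 = r^2; negative means the ray misses.
        disc = b * b - a * (oc @ oc - r * r)
        hit = disc >= 0
        t = np.full((height, width), np.inf)
        t[hit] = (b[hit] - np.sqrt(disc[hit])) / a[hit]  # nearer intersection
        # Keep the sphere only where it is in front of everything drawn so far,
        # which is what produces the occlusions of the dead leaves model.
        nearer = hit & (t > 0) & (t < depth)
        depth[nearer] = t[nearer]
        image[nearer] = col
    return image, depth


def make_stereo_pair(n_spheres=200, f=60.0, baseline=0.2,
                     height=64, width=64, seed=0):
    """Generate one (left, right, disparity) sample; all distributions are assumed."""
    rng = np.random.default_rng(seed)
    radii = rng.uniform(0.2, 1.0, n_spheres)
    centers = np.column_stack([
        rng.uniform(-6, 6, n_spheres),
        rng.uniform(-6, 6, n_spheres),
        rng.uniform(4, 12, n_spheres),  # keeps every sphere in front of both cameras
    ])
    colors = rng.uniform(0, 1, n_spheres)
    left, depth_left = render_view(centers, radii, colors, 0.0, f, height, width)
    right, _ = render_view(centers, radii, colors, baseline, f, height, width)
    # Ground-truth disparity for the left view; 0 where no sphere is visible.
    disparity = f * baseline / depth_left
    return left, right, disparity


left, right, disp = make_stereo_pair()
```

With the right camera shifted by the baseline along +X, a point seen at column `u` in the left image appears at column `u - disp` in the right image. Generating more samples only requires changing `seed`, which reflects the low cost and easy scaling the abstract describes.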