On Monocular Depth Estimation and Uncertainty Quantification Using Classification Approaches For Regression
Xuanlong Yu, Gianni Franchi, Emanuel Aldea
SPS
Length: 00:08:58
Fine-grained texture classification differentiates between similar materials. When large unlabeled datasets are available, representation learning helps distinguish between classes. In this paper, we show that harnessing contrastive self-supervised learning (SSL) for visual representations yields performance gains in fine-grained texture classification. We demonstrate that, in the absence of sufficient labeled training data, SSL pre-training provides better representations for classification than supervised methods. We propose a novel pretext task, part-to-whole, which exploits the property of textures that a randomly cropped patch is structurally similar to the whole image. We also propose using representations tapped from multiple layers of a convolutional neural network (CNN) and show that combining high-level and low-level features improves discriminability. We present extensive experiments on the ground-terrain outdoor scenes (GTOS) dataset and show that multi-layer global average pooling (multi-GAP) representations from an EfficientNet-B4 model trained with the part-to-whole pretext task beat the current state-of-the-art (SOTA) methods on single-view material classification in limited labeled data settings.
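The two core ideas above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: NumPy arrays stand in for CNN feature maps, the layer shapes are loosely modeled on EfficientNet stages but are assumptions, and the part-to-whole pair is shown only as positive-pair construction (the contrastive loss itself is omitted).

```python
import numpy as np

def part_to_whole_pair(image, patch_size):
    """Build a positive pair for contrastive SSL: a random crop and the
    whole image, exploiting that texture patches resemble the full texture.
    image: (channels, height, width)."""
    _, h, w = image.shape
    y = np.random.randint(0, h - patch_size + 1)
    x = np.random.randint(0, w - patch_size + 1)
    patch = image[:, y:y + patch_size, x:x + patch_size]
    return patch, image

def global_average_pool(feature_map):
    """Reduce a (channels, height, width) feature map to a (channels,) vector."""
    return feature_map.mean(axis=(1, 2))

def multi_gap(feature_maps):
    """Concatenate GAP vectors tapped from several CNN layers, combining
    low-level and high-level features into one descriptor."""
    return np.concatenate([global_average_pool(f) for f in feature_maps])

# Illustrative shapes (assumed, not taken from the paper)
low = np.random.rand(24, 56, 56)    # early layer: low-level texture cues
mid = np.random.rand(112, 14, 14)   # intermediate layer
high = np.random.rand(1792, 7, 7)   # final layer: high-level semantics

descriptor = multi_gap([low, mid, high])
print(descriptor.shape)  # (1928,) = 24 + 112 + 1792 channels

img = np.random.rand(3, 224, 224)
patch, whole = part_to_whole_pair(img, patch_size=32)
```

In training, the patch and the whole image would each be encoded by the CNN and pulled together in embedding space by a contrastive objective, while the multi-GAP descriptor would feed the downstream material classifier.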