18 Oct 2022

Fine-grained texture classification differentiates between similar materials. When large unlabeled datasets are available, representation learning can help distinguish between classes. In this paper, we show that harnessing contrastive self-supervised learning (SSL) for visual representations leads to performance gains for fine-grained texture classification. We demonstrate that, in the absence of sufficient labeled training data, SSL pre-training provides better representations for classification than supervised methods. We propose a novel pretext task, part-to-whole, which exploits the property of textures that a randomly cropped patch is similar in structure to the whole image. We also propose using representations tapped from multiple layers of a convolutional neural network (CNN) and show the effectiveness of combining high-level and low-level features in improving discriminability. We present extensive experiments on the ground-terrain outdoor scenes (GTOS) dataset and show that multi-layer global average pooling (multi-GAP) representations from an EfficientNet-B4 model trained with the part-to-whole pretext task beat the current state-of-the-art (SOTA) methods on single-view material classification in limited labeled data settings.
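The abstract defines the part-to-whole pretext task only in words, so the following is a minimal sketch of how such a positive pair could be formed and scored, assuming a SimCLR-style NT-Xent objective; the crop scale, input resolution, and augmentation choices are illustrative assumptions, not the authors' exact recipe.

```python
# Sketch (not the authors' code): a "part-to-whole" positive pair pairs a small
# random crop of a texture image with the full image, resized to a common size.
import torch
import torch.nn.functional as F
import torchvision.transforms as T
from PIL import Image

INPUT_SIZE = 224  # assumed input resolution

whole_view = T.Compose([
    T.Resize((INPUT_SIZE, INPUT_SIZE)),
    T.ToTensor(),
])
part_view = T.Compose([
    T.RandomResizedCrop(INPUT_SIZE, scale=(0.05, 0.3)),  # small patch of the texture
    T.ToTensor(),
])

def part_to_whole_pair(img: Image.Image):
    """Return the (part, whole) views used as a contrastive positive pair."""
    return part_view(img), whole_view(img)

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5):
    """SimCLR-style NT-Xent loss over a batch of (part, whole) embeddings."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)
    sim = z @ z.t() / temperature                        # (2N, 2N) cosine similarities
    n = z1.size(0)
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool, device=z.device), float("-inf"))
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)                 # each view's positive is its partner
```

In this sketch, each batch element yields a (part, whole) pair; both views would be passed through the backbone and a projection head to obtain z1 and z2 before computing the loss.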

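The multi-GAP representation described above can likewise be sketched as global average pooling over several intermediate feature maps followed by concatenation. The snippet below assumes the timm library's EfficientNet-B4 with its features_only interface; which stages are tapped is an illustrative choice, not necessarily the paper's.

```python
# Sketch (assumptions: timm's efficientnet_b4 and its default feature taps).
import timm
import torch

backbone = timm.create_model("efficientnet_b4", pretrained=False, features_only=True)

def multi_gap(x: torch.Tensor) -> torch.Tensor:
    """Global-average-pool every tapped stage and concatenate into one descriptor."""
    feats = backbone(x)                               # list of feature maps, one per stage
    pooled = [f.mean(dim=(2, 3)) for f in feats]      # GAP: (B, C_i, H, W) -> (B, C_i)
    return torch.cat(pooled, dim=1)                   # (B, sum_i C_i) multi-GAP vector

x = torch.randn(2, 3, 224, 224)
descriptor = multi_gap(x)  # low- and high-level features combined for a linear classifier
```

In the limited-label setting the abstract describes, such a descriptor would feed a lightweight classifier trained on the few labeled GTOS examples while the backbone stays frozen from SSL pre-training.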