MEJIGCLU: MORE EFFECTIVE JIGSAW CLUSTERING FOR UNSUPERVISED VISUAL REPRESENTATION LEARNING
Yongsheng Zhang, Qing Liu, Yang Zhao, Yixiong Liang
SPS
Unsupervised visual representation learning aims to learn general features from unlabelled data. Early methods design intra-image pretext tasks as learning targets; they incur low computational overhead but deliver unsatisfactory performance. Recent methods introduce contrastive learning and achieve surprising performance, but they require multiple views of each training sample in one training batch, resulting in high computational overhead. To achieve results competitive with contrastive learning at low computational overhead, we propose a new unsupervised representation learning method with jigsaw clustering and classification as pretext tasks. Our approach partitions each training image into patches, randomly shuffles the patches and reassembles them into new training data, and uses a clustering branch and a classification branch to drive the network to learn discriminative features. Compared with current methods, ours achieves state-of-the-art results on both the ImageNet and COCO datasets.
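The jigsaw construction described above (partition each image into patches, shuffle patches across the batch, reassemble them into new training images) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `jigsaw_shuffle`, the 2×2 grid default, and the label layout are all assumptions for exposition. The returned source-image indices would serve as targets for the clustering branch and the original grid positions as targets for the classification branch.

```python
import numpy as np

def jigsaw_shuffle(batch, grid=2, rng=None):
    """Illustrative sketch (not the paper's code): cut each image in
    `batch` (N, H, W, C) into a grid of patches, shuffle all patches
    across the batch, and reassemble them into N new jigsaw images.
    Also returns, per patch slot, the source-image index (clustering
    target) and the patch's original grid position (classification
    target)."""
    rng = rng or np.random.default_rng()
    n, h, w, c = batch.shape
    ph, pw = h // grid, w // grid
    # Cut every image into grid*grid patches, remembering provenance.
    patches, src_img, src_pos = [], [], []
    for i in range(n):
        for r in range(grid):
            for s in range(grid):
                patches.append(batch[i, r*ph:(r+1)*ph, s*pw:(s+1)*pw])
                src_img.append(i)
                src_pos.append(r * grid + s)
    order = rng.permutation(len(patches))
    # Reassemble the shuffled patches into n new images.
    out = np.empty_like(batch)
    img_labels = np.empty((n, grid * grid), dtype=int)
    pos_labels = np.empty((n, grid * grid), dtype=int)
    for k, p in enumerate(order):
        i, slot = divmod(k, grid * grid)
        r, s = divmod(slot, grid)
        out[i, r*ph:(r+1)*ph, s*pw:(s+1)*pw] = patches[p]
        img_labels[i, slot] = src_img[p]
        pos_labels[i, slot] = src_pos[p]
    return out, img_labels, pos_labels
```

Because the shuffle is a permutation of the patch set, the new batch contains exactly the same pixels as the original, only rearranged, so no extra views are generated and the batch size (and thus the forward-pass cost) stays unchanged.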