An Unsupervised Cross-Modal Hashing Method Robust To Noisy Training Image-Text Correspondences in Remote Sensing
Georgii Mikriukov, Mahdyar Ravanbakhsh, Begüm Demir
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:14:58
User-generated content (UGC) video quality assessment (UGC-VQA) is a challenging research topic because UGC videos are highly diverse and public UGC quality datasets are limited. State-of-the-art (SOTA) UGC quality models tend to be high-complexity, and their trade-offs among complexity, accuracy, and generalizability are rarely discussed. We propose a new perspective on UGC-VQA and show that model complexity may not be critical to performance, whereas a more diverse dataset is essential for training a better model. We illustrate this with a lightweight model, UVQ-lite, which is more efficient and generalizes better (overfits less) than baseline SOTA models. We also propose a new way to analyze the sufficiency of the training set by leveraging UVQ's comprehensive features. Our results motivate a fresh look at the future of UGC-VQA research, which we believe is headed toward more efficient models and more diverse datasets.