Rain-Prior injected Knowledge Distillation For Single Image Deraining
Yuzhang Hu, Wenhan Yang, Jiaying Liu, Zongming Guo
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:09:58
Video instance segmentation (VIS) is a hybrid task that requires detecting, segmenting, and tracking instances belonging to a known class set in videos. In real-world applications, there is an urgent need to quickly adapt a VIS model to handle novel-class instances given only a few labeled videos. In this work, we tackle the task of few-shot video instance segmentation (FVIS), which is challenging due to large variations in object appearance and motion. We propose a robust temporally coherent strategy (VTFA) based on a two-stage fine-tuning approach. VTFA makes video instance segmentation for novel classes temporally smooth and reduces the classification bias between novel and base classes. The proposed Memory-aware Temporal Context Encoding Module (MTCE) in VTFA encodes temporal context information, which contributes to consistency in the final predictions. We also propose an instance-level Pair-wise Contrastive Loss (IPC Loss) that operates on both novel and base classes to enhance the robustness of instance classification. To validate our method, we develop a YouTube-VIS-FS benchmark and compare our method against several baselines. The experimental evaluation shows that our strategy is superior or competitive with these strong baselines.
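To illustrate the general idea behind a pair-wise contrastive loss of the kind the abstract describes, the sketch below shows a minimal version: embeddings of the same instance are pulled together, while embeddings of different instances are pushed apart up to a margin. This is an assumed, generic formulation for illustration only, not the paper's exact IPC Loss; the function name, margin value, and embedding format are all hypothetical.

```python
import math

def pairwise_contrastive_loss(embeddings, labels, margin=1.0):
    """Generic pair-wise contrastive loss over all instance pairs.

    NOTE: an illustrative sketch, not the IPC Loss from the paper.
    Same-label pairs contribute their squared distance (pull together);
    different-label pairs contribute a squared hinge on the margin
    (push apart until they are at least `margin` away).
    """
    n = len(labels)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            # Euclidean distance between the two instance embeddings.
            d = math.sqrt(sum((a - b) ** 2
                              for a, b in zip(embeddings[i], embeddings[j])))
            if labels[i] == labels[j]:
                total += d ** 2                        # pull positives together
            else:
                total += max(0.0, margin - d) ** 2     # push negatives apart
            pairs += 1
    return total / pairs if pairs else 0.0
```

For example, identical positive pairs and negatives already separated by more than the margin both contribute zero, so a well-separated embedding yields a loss near zero.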