IAST: Instance Association Relying on Spatio-temporal Features for Video Instance Segmentation

Junhao Chen (Zhejiang University of Technology); Sheng Liu (Zhejiang University of Technology); Ruixiang Chen (Zhejiang University of Technology); Bingnan Guo (Zhejiang University of Technology); Feng Zhang (Zhejiang University of Technology)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
06 Jun 2023

Most offline video instance segmentation (VIS) methods do not consider multi-scale spatio-temporal features, which leads to unstable instance association across frames. To address this problem, we propose IAST, which builds Instance Association relying on Spatio-Temporal features for video instance segmentation. Specifically, we design a novel Scale-to-Scale Attention Module in the encoder of IAST, which constructs stable cross-frame instance associations by efficiently leveraging multi-scale spatio-temporal features. In addition, we introduce a new data augmentation method called Sequential Copy-Paste, which effectively alleviates the overfitting caused by insufficient training data and enhances the robustness of the model. Empirically, IAST achieves state-of-the-art results on VIS benchmarks with a ResNet-50 backbone: 47.4% AP on YouTube-VIS 2019 and 41.6% AP on YouTube-VIS 2021. These results outperform the previous state of the art by 1.0% while using fewer parameters. Code is available at https://github.com/clozureyez/IAST.
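
The abstract does not spell out how Sequential Copy-Paste works, so the following is only a rough sketch of one plausible reading: an instance is copied from a source clip and pasted into a destination clip at every frame, so that the pasted object stays consistent over time. The function name `sequential_copy_paste`, the array layouts, and the occlusion handling are all assumptions made for illustration and are not taken from the IAST code.

```python
import numpy as np


def sequential_copy_paste(src_frames, src_masks, dst_frames, dst_masks):
    """Hypothetical sketch: paste one instance from a source clip into a
    destination clip, frame by frame.

    src_frames, dst_frames: uint8 arrays of shape (T, H, W, 3)
    src_masks: bool array of shape (T, H, W), the pasted instance per frame
    dst_masks: bool array of shape (N, T, H, W), the N existing instances
    Returns the augmented frames and masks of shape (N + 1, T, H, W).
    """
    if src_frames.shape != dst_frames.shape:
        raise ValueError("this sketch assumes both clips share one shape")

    # Add a channel axis so the mask broadcasts over the RGB channels.
    paste = src_masks[..., None]
    out_frames = np.where(paste, src_frames, dst_frames)

    # Assumption: existing instances are occluded where the pasted one lands.
    occluded = dst_masks & ~src_masks[None]
    out_masks = np.concatenate([occluded, src_masks[None]], axis=0)
    return out_frames, out_masks
```

Because the same source instance is pasted at every time step, its mask track forms a new, temporally coherent instance in the destination clip. A real pipeline would likely also apply scaling, flipping, or blending, which this sketch leaves out.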
