Analysis of Video Quality induced Spatio-Temporal Saliency Shifts

Xinbo Wu, Zhengyan Dong, Fan Zhang, Paul L. Rosin, Hantao Liu

Length: 00:09:42

18 Oct 2022

In multi-task learning (MTL) for visual scene understanding, it is crucial to transfer useful information between tasks with minimal interference. In this paper, we propose a novel architecture that transfers informative features by applying an attention mechanism to the multi-scale features of the tasks. Because applying an attention module directly to all possible features across scales and tasks is computationally expensive, we propose to apply attention sequentially, first across tasks and then across scales. The cross-task attention module (CTAM) is applied first to facilitate the exchange of relevant information between the features of different tasks at the same scale. The cross-scale attention module (CSAM) then aggregates useful information from feature maps at different resolutions within the same task. We also attempt to capture long-range dependencies through a self-attention module in the feature extraction network. Extensive experiments demonstrate that our method achieves state-of-the-art performance on the NYUD-v2 and PASCAL-Context datasets. Our code will be publicly available later.
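The sequential ordering described in the abstract (cross-task attention at each scale, followed by cross-scale attention within each task) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of plain scaled dot-product attention with identity projections, the residual connection, and the representation of each feature map as a flat list of token vectors are all assumptions made for illustration. It also assumes at least two tasks and two scales, so that every attention call has something to attend to.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, sources):
    """Scaled dot-product attention with identity projections (assumption).

    queries: list of d-dim token vectors that receive information.
    sources: list of d-dim token vectors that provide information.
    Returns one updated vector per query (residual connection assumed).
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in sources]
        weights = softmax(scores)
        attended = [sum(w * k[i] for w, k in zip(weights, sources)) for i in range(d)]
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out


def cross_task_step(feats):
    """Hypothetical stand-in for CTAM: at each scale, every task attends
    to the tokens of the other tasks at that same scale."""
    result = {}
    for task, scales in feats.items():
        result[task] = {}
        for scale, tokens in scales.items():
            others = [tok for u in feats if u != task for tok in feats[u][scale]]
            result[task][scale] = cross_attention(tokens, others)
    return result


def cross_scale_step(feats):
    """Hypothetical stand-in for CSAM: within each task, every scale
    attends to the tokens of that task's other scales."""
    result = {}
    for task, scales in feats.items():
        result[task] = {}
        for scale, tokens in scales.items():
            others = [tok for r in scales if r != scale for tok in scales[r]]
            result[task][scale] = cross_attention(tokens, others)
    return result


def sequential_attention(feats):
    """Apply the cross-task step first, then the cross-scale step."""
    return cross_scale_step(cross_task_step(feats))


if __name__ == "__main__":
    # Toy features: two tasks, two scales, 3-dim tokens (illustrative values only).
    def make_tokens(n):
        return [[0.1 * (i + j) for j in range(3)] for i in range(n)]

    feats = {
        "segmentation": {"fine": make_tokens(4), "coarse": make_tokens(2)},
        "depth": {"fine": make_tokens(4), "coarse": make_tokens(2)},
    }
    out = sequential_attention(feats)
    for task, scales in out.items():
        for scale, tokens in scales.items():
            print(task, scale, len(tokens), "tokens")
```

In this sketch each step only pairs a feature map with maps that share its scale or its task, rather than with every other map across all tasks and scales at once, which is one way to read the complexity argument made in the abstract.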

Tags:

International Conference on Image Processing

IEEE ICIP 2022

icip

Value-Added Bundle(s) Including this Product

ICIP 2022, October 16-19, 2022, Bordeaux, France - Presentation Videos Product Bundle
