27 Oct 2020

One of the key challenges in no-reference video quality assessment (NR-VQA) is the absence of a reference video against which to measure the similarity or difference of the distorted video. In this paper, an encoder-decoder model is proposed to predict pixel-by-pixel similarity maps from the distorted video. The model takes multiple frames as input, since correlated pixels in adjacent frames can be exploited to recover the similarity map of the middle frame of the distorted video clip. In addition, to further exploit the temporal perception mechanism of the human visual system (HVS), which is relevant to perceptual video distortion measurement, visual persistence and temporal memory effects are considered in the design of the spatio-temporal pooling network. Experimental results demonstrate that the proposed method outperforms state-of-the-art NR-VQA metrics.
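The abstract does not specify the network architecture, so the following PyTorch sketch is illustrative only. It assumes a small convolutional encoder-decoder that maps a stack of adjacent distorted frames to a per-pixel similarity map for the middle frame, and a separate pooling stage in which a GRU stands in for the temporal memory effect and an exponentially decaying recency weight stands in for visual persistence; all layer sizes, the GRU, and the weighting scheme are assumptions, not the authors' design.

```python
# Minimal sketch (not the authors' code) of the two-stage idea in the abstract:
# (1) encoder-decoder: clip of distorted frames -> similarity map of the middle frame,
# (2) spatio-temporal pooling of per-frame scores into a clip-level quality score.
import torch
import torch.nn as nn


class SimilarityMapNet(nn.Module):
    """Encoder-decoder: stacked grayscale frames -> similarity map of the middle frame."""

    def __init__(self, num_frames=5, base_channels=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(num_frames, base_channels, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(base_channels, base_channels * 2, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(base_channels * 2, base_channels, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base_channels, 1, 4, stride=2, padding=1),
            nn.Sigmoid(),  # similarity values in [0, 1], e.g. SSIM-like maps
        )

    def forward(self, frames):                       # frames: (B, num_frames, H, W)
        return self.decoder(self.encoder(frames))    # (B, 1, H, W)


class TemporalPooling(nn.Module):
    """Pools per-frame quality scores into a clip-level score.

    The GRU models the temporal memory effect (earlier quality influences later
    judgements); the exponentially decaying weight loosely mimics visual persistence.
    """

    def __init__(self, hidden=16):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, frame_scores):                  # frame_scores: (B, T)
        h, _ = self.gru(frame_scores.unsqueeze(-1))   # (B, T, hidden)
        per_frame = self.head(h).squeeze(-1)          # (B, T)
        T = per_frame.shape[1]
        w = torch.exp(-0.1 * torch.arange(T - 1, -1, -1, dtype=per_frame.dtype))
        w = w / w.sum()                               # recency weighting (assumption)
        return (per_frame * w).sum(dim=1)             # (B,) clip-level quality score


if __name__ == "__main__":
    clip = torch.rand(2, 5, 64, 64)                   # 2 clips, 5 adjacent frames each
    sim_map = SimilarityMapNet(num_frames=5)(clip)
    # spatial mean of the similarity map as a crude per-frame score (assumption)
    frame_scores = sim_map.mean(dim=(1, 2, 3)).unsqueeze(1).repeat(1, 8)
    clip_score = TemporalPooling()(frame_scores)
    print(sim_map.shape, clip_score.shape)            # (2, 1, 64, 64) and (2,)
```

The split into a map-prediction stage and a pooling stage mirrors the abstract's structure: the encoder-decoder supplies a dense, spatially resolved quality estimate without a reference, and the pooling stage aggregates those estimates over time in a way that can account for HVS effects such as persistence and memory.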
