Length: 15:10
04 May 2020

High-definition 360 videos encoded in fine quality are typically too large to stream in their entirety over bandwidth (BW)-constrained networks. One popular remedy is to extract and send only the spatial sub-region corresponding to a viewer's current field-of-view (FoV) in a head-mounted display, enabling more BW-efficient streaming. Because the round-trip-time (RTT) delay between server and client is non-negligible, accurate head movement prediction that foretells a viewer's future FoVs is essential. Existing approaches are either overly simplistic in their modeling, predicting poorly when RTT is large, or overly reliant on data-driven learning, resulting in inflexible models that are not robust to RTT heterogeneity. In this paper, we cast the head movement prediction task as a sparse directed graph learning problem, where three sources of relevant information (a 360 image saliency map, collected viewers' head movement traces, and a biological head rotation model) are aggregated into a unified Markov model. Specifically, we formulate a constrained optimization problem that minimizes an l_2-norm fidelity term and a sparsity term, corresponding respectively to trace data / saliency consistency and a sparse graph model prior. We solve the problem alternately using a hybrid of iteratively reweighted least squares (IRLS) and Frank-Wolfe optimization. Experiments show that our prediction scheme noticeably outperforms existing proposals across a wide range of RTTs.
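To make the optimization concrete, here is a minimal sketch (not the authors' implementation) of the hybrid IRLS / Frank-Wolfe scheme the abstract describes, under the following assumptions: FoV states are discretized into n tiles, head movement traces are given as per-frame state distributions X, and the learned sparse Markov transition matrix P minimizes a least-squares fidelity term plus an IRLS-reweighted sparsity term, with each row of P constrained to the probability simplex. All function and variable names are hypothetical.

import numpy as np

def learn_transition_matrix(X, lam=0.1, n_outer=5, n_fw=100, eps=1e-6):
    # X: (T, n) array of per-frame FoV state distributions over n tiles,
    # e.g. one-hot rows from a discretized head-movement trace.
    Xp, Xn = X[:-1], X[1:]              # consecutive (current, next) pairs
    n = X.shape[1]
    P = np.full((n, n), 1.0 / n)        # uniform row-stochastic initialization
    for _ in range(n_outer):
        W = lam / (np.abs(P) + eps)     # IRLS reweighting of the sparsity term
        for k in range(n_fw):
            # Gradient of ||Xp P - Xn||_F^2 + <W, P>; since P >= 0 on the
            # simplex, the reweighted l1 sparsity term is linear in P.
            G = 2.0 * Xp.T @ (Xp @ P - Xn) + W
            # Frank-Wolfe step: the linear minimizer over each row's simplex
            # is the vertex (one-hot row) at that row's smallest gradient entry.
            S = np.zeros_like(P)
            S[np.arange(n), G.argmin(axis=1)] = 1.0
            gamma = 2.0 / (k + 2.0)     # standard diminishing FW step size
            P = (1.0 - gamma) * P + gamma * S
    return P

# Example use: predict the FoV distribution RTT frames ahead by powering P
# (Markov forward prediction), then prefetch the highest-probability tiles.
#   x_future = x_now @ np.linalg.matrix_power(P, rtt_in_frames)

The design mirrors the abstract's structure: the outer loop handles the sparse graph prior via IRLS reweighting, while the inner Frank-Wolfe loop handles the simplex-constrained fidelity minimization without any projection step.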
