REPETITION COUNTING FROM COMPRESSED VIDEOS USING SPARSE RESIDUAL SIMILARITY
Rishabh Khurana (Samsung Research, Bangalore); Jayesh Rajkumar Vachhani (Samsung R&D Institute Bengaluru); Sourabh Vasant Gothe (Samsung R&D Institute Bangalore, Karnataka, India); Pranay Kashyap (Samsung Research Institute Bangalore)
SPS
Modern video codecs commonly reach triple-digit compression ratios, which points to the redundancy and low information density of the ubiquitous RGB frame representation of video. We propose an approach that predicts the count of a repeating action directly from the components of a compressed video. Bypassing the video decoding step entirely offers significant computational benefits. Furthermore, by leveraging intelligent single I-frame encodings and the sparse nature of accumulated residual vectors, we capture frame features efficiently even with lightweight feature extraction backbones. On the Countix dataset, our method achieves a 91.5% reduction in model size and a 91% reduction in FLOPS, with results competitive with the state of the art.
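To make the "triple-digit compression ratio" claim concrete, the following is a minimal arithmetic sketch, not taken from the paper. The resolution (1920x1080), frame rate (30 fps), 8-bit RGB channels, and the 5 Mbit/s encoded bitrate are illustrative assumptions, and the function names are hypothetical.

```python
def raw_rgb_bitrate(width, height, fps, bits_per_channel=8, channels=3):
    """Bits per second needed to store uncompressed RGB video."""
    return width * height * channels * bits_per_channel * fps


def compression_ratio(raw_bps, encoded_bps):
    """Ratio of uncompressed bitrate to encoded bitrate."""
    return raw_bps / encoded_bps


# Assumed example stream: 1080p at 30 fps, encoded at 5 Mbit/s.
raw_bps = raw_rgb_bitrate(1920, 1080, 30)
ratio = compression_ratio(raw_bps, 5_000_000)

print(f"raw bitrate: {raw_bps / 1e9:.2f} Gbit/s")
print(f"compression ratio: {ratio:.1f}x")
```

Under these assumed numbers the raw stream is about 1.49 Gbit/s and the ratio is roughly 300x, which is the kind of redundancy that motivates working on the compressed representation rather than on decoded RGB frames.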