HPFTN: Hierarchical Progressive Fusion Transformer Network for Video Denoising
Shuaitao Zhang (Hikvision Research Institute); Yuan Zhang (Hikvision Research Institute); Zheng Zhao (Hikvision Research Institute); Di Xie (Hikvision Research Institute); Shiliang Pu (Hikvision Research Institute)
This paper presents a simple yet effective approach to modeling space-time correspondences for video denoising. Unlike most existing approaches, our method, HPFTN, operates end-to-end on consecutive frames without motion estimation. To this end, the proposed hierarchical patch matching module uses a multi-scale correspondence matching scheme to build correspondences between neighboring frames and the current frame at reduced computational cost. The progressive feature fusion module further strengthens the representation of the current frame by exploiting spatio-temporal correlations across multiple frames at the patch level. Finally, the pyramid transformer reconstruction module combines high-level semantic features with low-level fine-grained details to predict clean video frames. Extensive quantitative and qualitative experiments validate the effectiveness of the proposed model. Our source code will be released.
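
The abstract does not specify how the hierarchical patch matching module is implemented. As a rough illustration of why matching at multiple scales can lower the cost of correspondence search, the sketch below performs classical coarse-to-fine block matching on raw pixels with a sum-of-squared-differences (SSD) cost. It is not the authors' module, which presumably operates on learned features. All function names, the SSD cost, the average-pooling downsampler, and the parameter defaults are assumptions made for this example.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a 2-D image by an integer factor (crops any remainder)."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor
    return img[:h, :w].reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def best_offset(ref_patch, frame, top, left, center, radius):
    """Try offsets within `radius` of `center`; return the lowest-SSD (dy, dx)."""
    size = ref_patch.shape[0]
    best, best_cost = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall partly outside the frame.
            if y < 0 or x < 0 or y + size > frame.shape[0] or x + size > frame.shape[1]:
                continue
            cost = np.sum((frame[y:y + size, x:x + size] - ref_patch) ** 2)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

def hierarchical_match(cur, nbr, top, left, size=8, factor=2,
                       coarse_radius=4, fine_radius=1):
    """Coarse-to-fine match of the patch at (top, left) in `cur` against `nbr`."""
    # Stage 1: wide search on the downsampled frames.
    cur_c, nbr_c = downsample(cur, factor), downsample(nbr, factor)
    s_c, t_c, l_c = size // factor, top // factor, left // factor
    coarse = best_offset(cur_c[t_c:t_c + s_c, l_c:l_c + s_c],
                         nbr_c, t_c, l_c, (0, 0), coarse_radius)
    # Stage 2: narrow refinement at full resolution around the scaled-up coarse match.
    center = (coarse[0] * factor, coarse[1] * factor)
    return best_offset(cur[top:top + size, left:left + size],
                       nbr, top, left, center, fine_radius)

# Usage: a synthetic neighbor frame shifted by (4, -2) pixels.
rng = np.random.default_rng(0)
current = rng.random((64, 64))
neighbor = np.roll(current, shift=(4, -2), axis=(0, 1))
offset = hierarchical_match(current, neighbor, top=24, left=24)
```

With these defaults the two stages evaluate 81 + 9 = 90 candidate offsets, and the coarse stage compares patches with a quarter of the pixels. An exhaustive full-resolution search over the same ±9 pixel range would evaluate 19 × 19 = 361 candidates.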