Face Aggregation Network For Video Face Recognition
Stefan Hörmann, Zhenxiang Cao, Martin Knoche, Fabian Herzog, Gerhard Rigoll
SPS
Length: 00:09:43
Typical approaches to video face recognition aggregate faces in feature space to obtain a single feature vector representing the entire video. Unlike most previous approaches, we aggregate the face images directly, which additionally yields a single representative face as an intermediate output, from which a more discriminative feature vector is extracted. State-of-the-art face aggregation methods are limited to a fixed number of input images. To overcome this limitation, we incorporate a permutation-invariant U-Net architecture that can process an arbitrary number of frames and employ it within a generative adversarial network. We demonstrate the effectiveness of our method on three popular benchmark datasets for video face recognition. Our approach outperforms the baselines on the YouTube Faces dataset, obtaining an accuracy of 96.62%. We also show that our method is robust against motion blur.
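To illustrate what permutation invariance over an arbitrary number of frames means, here is a minimal sketch using element-wise mean pooling, a common symmetric aggregation function. This is an assumption for illustration only: the abstract does not state how the U-Net achieves permutation invariance, and the function name `aggregate_frames` and the toy 4-dimensional frame vectors are hypothetical.

```python
import math
import random


def aggregate_frames(frames):
    """Average per-frame vectors element-wise.

    Mean pooling is a symmetric function: the result does not depend on
    frame order, and it accepts any number of frames (at least one).
    This is an illustrative stand-in, not the network from the paper.
    """
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / n for i in range(dim)]


random.seed(0)
# Five toy "frames", each a 4-dimensional vector (hypothetical data).
frames = [[random.random() for _ in range(4)] for _ in range(5)]

# Same frames in a different order.
shuffled = frames[:]
random.shuffle(shuffled)

pooled = aggregate_frames(frames)
pooled_shuffled = aggregate_frames(shuffled)

# Equal up to floating-point rounding, regardless of frame order.
same = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(pooled, pooled_shuffled))
print("order-independent:", same)

# A different number of frames yields an output of the same size.
print("output size with 2 frames:", len(aggregate_frames(frames[:2])))
```

The same property holds for other symmetric reductions such as max or sum; mean is used here only because it keeps the output scale independent of the frame count.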