Binauralization Robust to Camera Rotation Using 360° Videos
Masaki Yoshida (Hokkaido University); Ren Togo (Hokkaido University); Takahiro Ogawa (Hokkaido University); Miki Haseyama (Hokkaido University)
SPS
We propose a novel binauralization method that is robust to camera rotation. Because binaural audio gives the listener a 3D sensation, it can enhance the immersive experience of a video. Researchers have explored generating binaural audio from monaural audio to deepen the experience of already captured videos without special recording devices. However, binauralization of real-world videos is challenging because of camera rotation: when the camera rotates, both the sound sources and the background move in the frame, which makes it difficult to predict the exact sound source positions. To tackle this problem, we propose a training data generation pipeline for binauralization that uses 360° videos. From 360° videos, we generate monocular videos and binaural audio with camera rotation as training data. Additionally, we construct a new binauralization framework that performs multi-task learning with camera localization. The camera localization task predicts the camera rotation and thereby helps the binauralization. Experimental results show that our method can binauralize videos with camera rotation.
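The audio half of such a data generation pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the 360° video carries first-order ambisonic audio with channels W, X, Y, Z (X forward, Y left, W equal to the unscaled source signal), it handles yaw rotation only, and it replaces true HRTF-based binaural rendering with two virtual cardioid microphones facing left and right. The function names `rotate_foa_yaw` and `decode_stereo` are hypothetical.

```python
import numpy as np

def rotate_foa_yaw(w, x, y, z, yaw):
    """Re-express a first-order ambisonic field in the frame of a camera
    that has turned counterclockwise by `yaw` radians.
    `yaw` may be a scalar or a per-sample array (time-varying rotation)."""
    c, s = np.cos(yaw), np.sin(yaw)
    # A source at azimuth theta appears at theta - yaw in the camera frame.
    x_rot = x * c + y * s
    y_rot = y * c - x * s
    return w, x_rot, y_rot, z  # W and Z are unaffected by a yaw rotation

def decode_stereo(w, x, y, z):
    """Crude two-channel decode (assumption: virtual cardioids at +/-90 degrees,
    used here as a stand-in for HRTF-based binaural rendering)."""
    left = 0.5 * (w + y)
    right = 0.5 * (w - y)
    return np.stack([left, right])

# Toy signal: a constant source located directly to the left (azimuth 90 degrees).
n = 4
sig = np.ones(n)
w, x, y, z = sig, np.zeros(n), sig.copy(), np.zeros(n)

# The camera sweeps from 0 to 180 degrees, so the source drifts from left to right.
yaw = np.linspace(0.0, np.pi, n)
stereo = decode_stereo(*rotate_foa_yaw(w, x, y, z, yaw))
print(stereo.round(3))
```

In a full pipeline, the same yaw trajectory would presumably also select the field of view cropped from each 360° frame and serve as the target for the camera localization task, so that video, audio, and rotation labels stay consistent.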