HIFI-SVC: FAST HIGH FIDELITY CROSS-DOMAIN SINGING VOICE CONVERSION
Yong Zhou, Xiangju Lu
SPS
Length: 00:14:49
This paper presents HiFi-SVC, a small cross-domain singing voice conversion model for generating high-fidelity 22.05 kHz singing voices. Building on the state-of-the-art neural vocoder HiFi-GAN and a convolution-based module for modeling F0, HiFi-SVC can be trained end-to-end on either speech or singing data. It achieves better voice similarity than FastSVC on two of the datasets while using slightly fewer parameters. We also propose a pitch adjustment method for improving conversion quality.
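The abstract does not say how the pitch adjustment works. Purely as an illustration, one common form of pitch adjustment in voice conversion shifts the source F0 contour so that its mean log-F0 matches that of the target singer. The sketch below assumes that form; it is not taken from the paper. The function name `shift_f0_to_target`, its parameters, and the convention that unvoiced frames carry an F0 of 0 Hz are all hypothetical.

```python
import numpy as np


def shift_f0_to_target(source_f0, target_mean_log_f0):
    """Scale voiced F0 values so their mean log-F0 equals the target's.

    Hypothetical helper for illustration; not the method from the paper.
    """
    f0 = np.asarray(source_f0, dtype=np.float64)
    voiced = f0 > 0  # assumption: unvoiced frames are marked with 0 Hz
    if not voiced.any():
        return f0.copy()  # nothing to shift
    source_mean_log_f0 = np.log(f0[voiced]).mean()
    # A constant offset in the log domain is a constant ratio in Hz,
    # so the melody's intervals are preserved.
    ratio = np.exp(target_mean_log_f0 - source_mean_log_f0)
    shifted = f0.copy()
    shifted[voiced] *= ratio
    return shifted


# Example: move a two-note contour toward a target centered on 440 Hz.
contour = [0.0, 220.0, 440.0, 0.0]
adjusted = shift_f0_to_target(contour, np.log(440.0))
```

Because the shift is a single multiplicative factor, the octave between the two voiced frames in the example is kept, and unvoiced frames stay at 0 Hz.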