HIFI-SVC: FAST HIGH FIDELITY CROSS-DOMAIN SINGING VOICE CONVERSION

Yong Zhou, Xiangju Lu

SPS

Length: 00:14:49

09 May 2022

This paper presents HiFi-SVC, a small cross-domain singing voice conversion model for generating high-fidelity 22.05 kHz singing voices. Building on the state-of-the-art neural vocoder HiFi-GAN and a convolution-based module for modeling F0, HiFi-SVC can be trained end-to-end with either speech or singing data. It achieves better voice similarity than FastSVC on two of the datasets while using slightly fewer parameters. We also propose a pitch adjustment method for improving conversion quality.

Tags:

phonetic posteriorgrams

pitch modelling

singing voice conversion