  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 00:14:46
13 May 2022

This paper proposes a high-speed neural vocoder for CPU implementation. Two approaches to speeding up autoregressive neural vocoders have been proposed: 1) simultaneous generation of multiple samples and 2) subband signal-based vocoding. So far they have been employed independently. Our neural vocoder combines them, generating multiple samples of the subband signals simultaneously, which makes it extremely fast. Although the subband signals are associated with one another, the conventional subband-based vocoder generates each subband signal from an independent probability distribution, which can degrade quality. To overcome this problem, we also introduce waveform generation that takes the association between subbands into account by employing a multivariate Gaussian distribution. Experiments show that 1) the proposed method is 1.81 times as fast as the conventional subband WaveRNN on a single-threaded CPU, and 2) it outperformed the conventional method in a subjective evaluation of naturalness, achieving a mean opinion score (MOS) of 4.08 on a text-to-speech task.
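
The difference between independent and joint subband sampling can be illustrated with a small sketch. This is not the paper's implementation: the number of subbands, the covariance values, and the function name `sample_subbands` are all hypothetical, and in a real vocoder the mean and covariance factor would be predicted by the network at each step. The sketch only shows that drawing from a multivariate Gaussian through a Cholesky factor preserves the correlation between subbands, whereas independent per-subband sampling discards it.

```python
import numpy as np

def sample_subbands(mean, chol_lower, num_draws, rng):
    """Draw joint subband samples x = mean + L z, with z ~ N(0, I).

    mean:       (B,) mean for each of B subbands
    chol_lower: (B, B) lower-triangular factor L with covariance = L @ L.T
    Returns an array of shape (num_draws, B).
    """
    z = rng.standard_normal((num_draws, mean.shape[0]))
    return mean + z @ chol_lower.T

rng = np.random.default_rng(0)
num_subbands = 4          # hypothetical value, chosen for illustration
rho = 0.6                 # hypothetical correlation between adjacent subbands

# Assumed covariance: correlation decays as rho**|i - j| between subbands.
idx = np.arange(num_subbands)
cov = rho ** np.abs(idx[:, None] - idx[None, :])
mean = np.zeros(num_subbands)

# Joint sampling: the full covariance is used.
joint = sample_subbands(mean, np.linalg.cholesky(cov), 20000, rng)

# Independent sampling: only the per-subband variances (the diagonal) are used.
indep_factor = np.diag(np.sqrt(np.diag(cov)))
indep = sample_subbands(mean, indep_factor, 20000, rng)

joint_corr = np.corrcoef(joint[:, 0], joint[:, 1])[0, 1]
indep_corr = np.corrcoef(indep[:, 0], indep[:, 1])[0, 1]
print(f"adjacent-subband correlation, joint:       {joint_corr:.2f}")
print(f"adjacent-subband correlation, independent: {indep_corr:.2f}")
```

With joint sampling the empirical correlation between adjacent subbands stays close to the assumed value, while with independent sampling it falls to roughly zero, which is the loss of inter-subband association the abstract attributes to the conventional method.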
