Efficient Adversarial Audio Synthesis Via Progressive Upsampling

Youngwoo Cho, Minwook Chang, Sanghyeon Lee, Hyoungwoo Lee, Gerard Jounghyun Kim, Jaegul Choo

Length: 00:11:12
09 Jun 2021

This paper proposes a novel generative model called PUGAN (progressive upsampling GAN), which progressively synthesizes high-quality audio in the raw-waveform domain. PUGAN builds on the earlier idea of progressively generating higher-resolution output by stacking multiple encoder-decoder architectures. Whereas WaveGAN, the existing state-of-the-art model, uses a single decoder architecture, our model generates audio signals and converts them to higher resolutions progressively, while using significantly fewer parameters than WaveGAN (e.g., 3.17x fewer for 16 kHz output). Our experiments show that audio signals can be generated in real time with quality comparable to that of WaveGAN in terms of inception scores and human perception.
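
To make the idea of progressive upsampling concrete, here is a minimal sketch of the data flow only. It is not the authors' implementation: each learned encoder-decoder stage is replaced by plain linear interpolation, and the function names (`upsample_2x`, `progressive_upsample`) and the intermediate rates of 4 kHz and 8 kHz are hypothetical choices made for illustration. Only the 16 kHz target comes from the abstract.

```python
import math
from typing import List, Tuple


def upsample_2x(x: List[float]) -> List[float]:
    """Double the sample rate by linear interpolation.

    Assumption: this stands in for one learned upsampling stage.
    """
    out: List[float] = []
    for i, v in enumerate(x):
        # At the right edge there is no next sample, so hold the last value.
        nxt = x[i + 1] if i + 1 < len(x) else v
        out.append(v)                 # keep the original sample
        out.append(0.5 * (v + nxt))   # insert the midpoint after it
    return out


def progressive_upsample(
    x: List[float], start_rate: int, target_rate: int
) -> List[Tuple[int, List[float]]]:
    """Apply 2x stages until target_rate is reached; return every stage."""
    ratio, rem = divmod(target_rate, start_rate)
    if rem != 0 or ratio < 1 or ratio & (ratio - 1) != 0:
        raise ValueError("target_rate must be start_rate times a power of two")
    stages = [(start_rate, x)]
    rate = start_rate
    while rate < target_rate:
        x = upsample_2x(x)
        rate *= 2
        stages.append((rate, x))
    return stages


# One second of a 440 Hz tone at a hypothetical 4 kHz starting rate.
low = [math.sin(2 * math.pi * 440 * n / 4000) for n in range(4000)]
stages = progressive_upsample(low, 4000, 16000)
for rate, signal in stages:
    print(rate, len(signal))
```

Each stage doubles the number of samples, so the signal passes through one intermediate resolution before reaching the target rate. In the model described above, every such step would be a trained network rather than a fixed interpolation.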

Chairs:
Sven Shepstone
