MAID: A Conditional Diffusion Model For Long Music Audio Inpainting
Kaiyang Liu (Sichuan University); Wendong Gan (Wiz Holdings Pte Ltd); Chenchen Yuan (Sichuan University)
SPS
Recent work on long music audio inpainting has focused on unconditionally generating new segments to fill corrupted audio segments. However, the generated segments may differ significantly from the original. To solve this problem, we propose MAID (Music Audio Inpainting DDPM), a model for music audio inpainting based on the DDPM (Denoising Diffusion Probabilistic Model). The model supports both unconditional and conditional inpainting of music audio: (a) in the unconditional inpainting task, MAID can inpaint gaps between 200 ms and 1600 ms long; (b) in the conditional inpainting task, the model generates new segments similar to the original segments, guided by the piano-rolls corresponding to the gaps. Experiments show that MAID outperforms the baseline.
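As a rough illustration of the gap setting described above, the following sketch builds a binary mask for a gap of a given duration and merges known samples with generated ones. It is not taken from the paper: the function names `gap_mask` and `merge_known_and_generated`, the 16 kHz sample rate, and the simple mask-and-merge step are all assumptions made for illustration, and the generated samples stand in for whatever the diffusion model would produce.

```python
def gap_mask(num_samples, gap_start_ms, gap_ms, sample_rate):
    """Return a list with 1 for known samples and 0 for samples inside the gap."""
    start = int(gap_start_ms * sample_rate / 1000)
    length = int(gap_ms * sample_rate / 1000)
    end = min(start + length, num_samples)
    return [0 if start <= i < end else 1 for i in range(num_samples)]


def merge_known_and_generated(original, generated, mask):
    """Keep known samples from the original; take gap samples from the generated signal."""
    return [m * x + (1 - m) * g for x, g, m in zip(original, generated, mask)]


# Hypothetical setup: 2 s of audio at an assumed 16 kHz, with a 200 ms gap starting at 1 s.
SAMPLE_RATE = 16000
mask = gap_mask(2 * SAMPLE_RATE, gap_start_ms=1000, gap_ms=200, sample_rate=SAMPLE_RATE)
gap_samples = mask.count(0)  # number of samples the model would need to fill
```

Under the same assumed sample rate, the longest gap mentioned in the abstract (1600 ms) would correspond to 25600 missing samples, which gives a sense of how much audio the model has to synthesize in one gap.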