
Controllable music inpainting with mixed-level and disentangled representation

Shiqi Wei (Fudan University); Ziyu Wang (NYU Shanghai); Weiguo Gao (Fudan University); Gus Xia (New York University Shanghai)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
06 Jun 2023

Music inpainting, the task of completing the missing part of a piece given its surrounding context, is an important problem in automated music generation. In this study, we contribute a controllable inpainting model that combines the high expressivity of mixed-length, disentangled music representations with the strong predictive power of masked language modeling. The model gives users flexible control over both the time scope (the length and location of the inpainted region) and the semantic features that composers often consider during composition, including pitch contour, rhythm pattern, and chords. The key design is to simultaneously predict disentangled representations over different time ranges, mirroring the thought process of a professional composer, who tracks the flow of various semantic features at different hierarchies in parallel. Subjective evaluation shows that our model generates much better results than the baseline and can produce melodies comparable to human compositions.
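To make the masked-modeling setup concrete, the following is a minimal illustrative sketch (not the authors' code) of how a masked-inpainting training pair can be constructed: a melody is treated as a token sequence, the span to be inpainted is replaced by mask tokens, and a model would be trained to predict the original tokens at the masked positions. All names and the token vocabulary here are hypothetical.

```python
# Toy sketch of masked-inpainting data preparation, assuming a melody
# is encoded as a flat list of note tokens. A real system would use a
# richer, disentangled representation and condition on controls such
# as chords or rhythm patterns.

MASK = "<mask>"

def make_inpainting_pair(tokens, start, length):
    """Mask tokens[start:start+length]; return (masked input, target map)."""
    masked = list(tokens)
    targets = {}
    for i in range(start, start + length):
        targets[i] = masked[i]  # remember the ground-truth token
        masked[i] = MASK        # hide it from the model
    return masked, targets

melody = ["C4", "D4", "E4", "G4", "E4", "D4", "C4", "C4"]
inp, tgt = make_inpainting_pair(melody, 2, 3)
print(inp)  # ['C4', 'D4', '<mask>', '<mask>', '<mask>', 'D4', 'C4', 'C4']
print(tgt)  # {2: 'E4', 3: 'G4', 4: 'E4'}
```

The masked positions define the user-chosen time scope (location and length of the inpainted region); the model fills them in conditioned on the surrounding context.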
