SEMI-SUPERVISED NEURAL CHORD ESTIMATION BASED ON A VARIATIONAL AUTOENCODER WITH LATENT CHORD LABELS AND FEATURES
Yiming Wu, Eita Nakamura, Kazuyoshi Yoshii, Tristan Carsault
This paper describes a statistically principled semi-supervised method for automatic chord estimation (ACE) that can make effective use of music signals regardless of whether chord annotations are available. The typical approach to ACE is to train a deep classification model in a supervised manner, using only annotated music signals. This discriminative approach has scarcely taken prior knowledge about chord label sequences into account. In contrast, we propose a unified generative and discriminative approach in the framework of amortized variational inference. More specifically, we formulate a deep generative model that represents the process by which chroma vectors are generated from discrete labels and continuous features. The labels are assumed to follow a Markov model favoring self-transitions, and the features are assumed to follow a standard Gaussian distribution. Given chroma vectors as observed data, the posterior distributions of the latent labels and features are computed approximately by using a deep classification model and a deep recognition model, respectively. These three models form a variational autoencoder and can be trained jointly in a semi-supervised manner. The experimental results show that regularizing the classification model with the Markov prior on chord labels and the generative model of chroma vectors improved the performance of ACE even under the supervised condition.
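To make the two priors named in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It builds a Markov transition matrix that favors self-transitions, scores a chord label sequence under it, and computes the KL divergence from a diagonal Gaussian posterior to the standard Gaussian prior. The function names, the 25-label vocabulary, and the self-transition probability of 0.9 are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np


def self_transition_matrix(n_labels, p_stay):
    """Row-stochastic transition matrix with p_stay on the diagonal.

    The remaining probability mass is spread uniformly over the other labels
    (an illustrative choice, not taken from the paper).
    """
    off_diagonal = (1.0 - p_stay) / (n_labels - 1)
    trans = np.full((n_labels, n_labels), off_diagonal)
    np.fill_diagonal(trans, p_stay)
    return trans


def markov_log_prior(labels, trans):
    """Log-probability of a label sequence under a first-order Markov chain
    with a uniform initial distribution."""
    labels = np.asarray(labels)
    log_p = -np.log(trans.shape[0])  # uniform initial label
    log_p += np.log(trans[labels[:-1], labels[1:]]).sum()  # transitions
    return log_p


def gaussian_kl_to_standard(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over all entries."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)


N_LABELS = 25  # assumption: 24 major/minor triads plus a no-chord label
P_STAY = 0.9   # assumption: illustrative self-transition probability

trans = self_transition_matrix(N_LABELS, P_STAY)

stable = [0, 0, 0, 0, 7, 7, 7, 7]  # one chord change
jumpy = [0, 7, 0, 7, 0, 7, 0, 7]   # a chord change at every frame

print("stable:", markov_log_prior(stable, trans))
print("jumpy: ", markov_log_prior(jumpy, trans))
print("KL at the prior:", gaussian_kl_to_standard(np.zeros(4), np.zeros(4)))
```

Under these assumed values, the sequence with a single chord change receives a higher prior log-probability than the one that switches at every frame, which is the kind of temporal smoothness a self-transition-favoring prior encourages in the classifier's output.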