04 May 2020

Music is a universally enjoyed art form, but listeners often respond to it in tremendously different ways. The same song can bring one person great joy and another deep sorrow. This paper focuses on modeling human music experience at the group level. In this scenario, human annotations play an important role in computational modeling, especially when the target constructs under study are hidden, such as dimensions of emotion or enjoyment during music listening. In this work, we investigate several ways to represent aggregate human annotations of the complex, subjective emotional experience of listening to music. We show the utility of several methods for fusing self-reported emotion and enjoyment ratings by predicting these responses from auditory features. Using traditional methods such as time alignment with simple averaging and Dynamic Time Warping, as well as state-of-the-art methods based on Expectation Maximization and Triplet Embeddings, we show that it is possible to accurately represent hidden constructs over time under noisy sampling conditions, as evidenced by better performance on behavioral response prediction. That subjective responses to complex musical stimuli can be accurately captured using these methods suggests broader applications in areas such as affective computing and music perception research.
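As a rough illustration of the two simpler fusion baselines mentioned in the abstract (time-aligned averaging and Dynamic Time Warping), the sketch below fuses several annotators' continuous ratings into a single reference signal. This is not the paper's implementation: the function names (`dtw_path`, `fuse_annotations`), the assumption that all traces are already resampled to a common frame rate, and the use of absolute difference as the local cost are illustrative choices, and the EM- and Triplet-Embedding-based methods are not shown.

```python
import numpy as np

def dtw_path(x, y):
    """Return a DTW alignment path between two 1-D sequences as (i, j) pairs."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])  # local cost between samples
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from (n, m) to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def fuse_annotations(ratings):
    """Fuse annotators' traces (num_annotators x num_frames) into one signal.

    Returns (simple average, DTW-aligned average). Each trace is warped onto
    the plain mean before re-averaging, compensating for rater-specific lags.
    """
    ratings = np.asarray(ratings, dtype=float)
    reference = ratings.mean(axis=0)           # simple time-aligned average
    warped = np.zeros_like(ratings)
    for k, trace in enumerate(ratings):
        aligned = np.zeros(ratings.shape[1])
        hits = np.zeros(ratings.shape[1])
        for i, j in dtw_path(trace, reference):
            aligned[j] += trace[i]
            hits[j] += 1
        warped[k] = aligned / np.maximum(hits, 1)
    return reference, warped.mean(axis=0)

# Example: three noisy, slightly lagged annotations of one latent signal.
t = np.linspace(0, 2 * np.pi, 200)
truth = np.sin(t)
raters = [np.roll(truth, lag) + 0.05 * np.random.randn(len(t)) for lag in (0, 5, 10)]
avg_fused, dtw_fused = fuse_annotations(raters)
```

Either fused signal could then serve as the prediction target for auditory features; the hidden-construct fusion methods evaluated in the paper (Expectation Maximization, Triplet Embeddings) address the same aggregation problem with different machinery.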
