Bio-Mimetic Attentional Feedback In Music Source Separation
Ashwin Bellur, Mounya Elhilali
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 28:50
Attention plays a vital role in sifting through the cacophony of sounds in everyday scenes by emphasizing the representation of target sounds relative to distractors. While its conceptual role is well established, there are competing theories as to how attentional feedback operates in the brain and how its mechanistic underpinnings can be incorporated into computational systems. These interpretations differ in how attentional feedback acts as an information bottleneck to aid perception. One interpretation is that attention adapts the sensory mapping itself to encode only the target cues. An alternative interpretation is that attention behaves as a gain modulator that enhances the target cues after they are encoded. Further, the theory of temporal coherence states that attention seeks to bind features that are temporally coherent with anchor features, where the anchors are determined by prior knowledge of the target objects. In this work, we study these competing theories within a deep-network framework for the task of music source separation. We show that these theories complement each other and, when employed together, yield state-of-the-art performance. We further show that systems with attentional mechanisms can be made to scale to mismatched conditions by retuning only the attentional modules with minimal data.
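To make two of the ideas above concrete, the following is a minimal toy sketch and not the authors' deep-network system. It assumes features are already encoded as a small channels-by-frames grid, treats one channel as a hypothetical anchor, and uses Pearson correlation with a hard threshold as a stand-in for temporal coherence. The function names (`gain_modulate`, `coherence_mask`) and the threshold value are illustrative assumptions.

```python
import math


def gain_modulate(features, gains):
    """Scale each already-encoded feature channel by a per-channel gain."""
    return [[g * x for x in channel] for channel, g in zip(features, gains)]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def coherence_mask(features, anchor_index, threshold=0.5):
    """Keep channels whose time course correlates with the anchor channel.

    The hard threshold is an assumption made for this toy example.
    """
    anchor = features[anchor_index]
    return [1.0 if pearson(ch, anchor) >= threshold else 0.0 for ch in features]


# Toy encoded features: 3 channels x 6 time frames (made-up values).
features = [
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],  # hypothetical anchor channel for the target
    [0.1, 0.9, 0.2, 0.8, 0.1, 0.9],  # rises and falls with the anchor
    [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],  # anti-phase with the anchor (distractor)
]

mask = coherence_mask(features, anchor_index=0)
attended = gain_modulate(features, mask)
```

Here the coherence mask doubles as the gain vector: channels that co-vary with the anchor pass through unchanged, while the anti-phase channel is zeroed. The other interpretation in the abstract, adapting the sensory mapping itself, would instead change how `features` is computed in the first place and is not covered by this sketch.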