Boosted Locality Sensitive Hashing: Discriminative, Efficient, and Scalable Binary Codes for Source Separation
Sunwoo Kim (Indiana University); Minje Kim (Indiana University)
We propose a novel adaptive boosting approach to learn discriminative binary hash codes, boosted locality sensitive hashing (BLSH), that can represent audio spectra efficiently. We aim to use the learned hash codes in the single-channel speech denoising task by designing a nearest neighbor search method that operates in the hashed feature space. To achieve optimal denoising results given the highly compact binary feature representation, our proposed BLSH algorithm learns simple logistic regressors as the weak learners in an incremental way (i.e., one by one) so that each weak learner is trained to complement the mistakes its predecessors have made. At test time, the weak learners' binary classification results transform each spectrum of noisy speech into a bit string, where the bits are ordered by their significance, adding scalability to the denoising system. Simple bitwise operations compute the Hamming distance to find the K-nearest matching hashed frames in the dictionary of training noisy speech spectra, whose associated ideal binary masks are averaged to estimate the denoising mask for that test mixture. In contrast to the locality sensitive hashing method's random projections, our proposed supervised learning algorithm trains the projections such that the distance between the self-similarity matrix of the hash codes and that of the original spectra is minimized. Hence, the process is conceptually similar to the AdaBoost algorithm, although ours is specialized in learning binary features for source separation rather than classification. Experimental results on speech denoising suggest that the BLSH algorithm learns more discriminative representations than Fourier or mel spectra and the nonlinear kernels derived from them. Our compact binary representation is expected to facilitate model deployment in resource-constrained environments, where comprehensive models (e.g., deep neural networks) are unaffordable.
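To make the inference step concrete, below is a minimal NumPy sketch of the Hamming-distance K-nearest-neighbor mask estimation the abstract describes: a hashed test frame is matched against a dictionary of hashed training frames via bitwise comparison, and the ideal binary masks of the K closest matches are averaged into a denoising mask. The function name, argument shapes, and K value are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def knn_denoising_mask(query_bits, dict_bits, dict_ibms, k=5):
    """Estimate a denoising mask for one hashed test frame (illustrative sketch).

    query_bits : (L,) array of {0, 1}, BLSH code of the noisy test spectrum
    dict_bits  : (N, L) array of {0, 1}, BLSH codes of training noisy spectra
    dict_ibms  : (N, F) array, ideal binary masks paired with the dictionary frames
    """
    # Hamming distance: with {0, 1} bit arrays, counting mismatched positions
    # is equivalent to XOR followed by a popcount.
    dists = np.count_nonzero(dict_bits != query_bits, axis=1)
    # Indices of the K nearest dictionary frames in Hamming space.
    nearest = np.argpartition(dists, k)[:k]
    # Average the associated ideal binary masks into a soft denoising mask.
    return dict_ibms[nearest].mean(axis=0)

# Hypothetical usage: multiply the estimated mask with the noisy magnitude
# spectrum of the test frame, then resynthesize with the noisy phase.
# denoised_mag = knn_denoising_mask(q, D_bits, D_ibms) * noisy_mag
```

Because the bits are ordered by significance, the same search can be run on a truncated prefix of the codes to trade denoising quality for speed, which is the scalability property the abstract refers to.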