Time Difference Of Arrival Estimation From Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks
Luca Comanducci, Fabio Antonacci, Augusto Sarti, Maximo Cobos
-
SPS
IEEE Members: $11.00
Non-members: $15.00Length: 23:49
The interest in deep learning methods for solving traditional signal processing tasks has been steadily growing in the last years. Time delay estimation (TDE) in adverse scenarios is a challenging problem, where classical approaches based on generalized cross-correlations (GCCs) have been widely used for decades. Recently, the frequency-sliding GCC (FS-GCC) was proposed as a novel technique for TDE based on a sub-band analysis of the cross-power spectrum phase, providing a structured two-dimensional representation of the time delay information contained across different frequency bands. Inspired by deep-learning-based image denoising solutions, we propose in this paper the use of convolutional neural networks (CNNs) to learn the time-delay patterns contained in FS-GCCs extracted in adverse acoustic conditions. Our experiments confirm that the proposed approach provides excellent TDE performance while being able to generalize to different room and sensor setups.