CONSTANT Q CEPSTRAL COEFFICIENTS FOR CLASSIFICATION OF NORMAL VS. PATHOLOGICAL INFANT CRY
Hemant A. Patil, Ankur T. Patil, Aastha Kachhi
SPS
Length: 00:10:16
Classification of normal vs. pathological infant cry is an interesting and technologically challenging research problem: the high-pitched source harmonics sample the vocal tract spectrum quasi-periodically, which results in extremely poor spectral resolution for commonly used spectral features such as Mel Frequency Cepstral Coefficients (MFCC). To address this, in this paper we propose a new feature extraction approach based on the Constant Q Transform (CQT), which is known to have variable spectro-temporal resolution w.r.t. Heisenberg's uncertainty principle in the signal processing framework. Further, the CQT is also known to preserve the form-invariance property (compared with its Short-Time Fourier Transform (STFT) counterpart), a desirable attribute because feature descriptors should be invariant w.r.t. shape, shift, rotation, and scaling. CQT-based features are then transformed to the cepstral domain to derive Constant Q Cepstral Coefficients (CQCC), which are then fed to a statistical and a discriminative classifier, namely, a Gaussian Mixture Model (GMM) and a Support Vector Machine (SVM), respectively. The CQCC-GMM and CQCC-SVM systems gave better results than MFCC across various experimental evaluation factors for the infant cry classification task on the widely used and statistically meaningful Baby Chilanto database. The best performance, 99.82% accuracy (0.44% EER), is observed for the CQCC-GMM system.
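The variable resolution of the CQT and the step into the cepstral domain can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the sampling rate, the 110 Hz lowest bin, the 12 bins per octave, and the synthetic 440 Hz test tone are all assumptions chosen for illustration. It also takes the DCT directly on the geometrically spaced log-power spectrum, omitting the uniform resampling step that CQCC pipelines commonly apply first.

```python
import cmath
import math


def cq_center_freqs(f_min, bins_per_octave, n_bins):
    """Geometrically spaced center frequencies f_k = f_min * 2**(k / B)."""
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def q_factor(bins_per_octave):
    """Ratio of center frequency to bandwidth; the same for every bin."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


def window_lengths(fs, freqs, q):
    """Analysis window per bin: long at low f_k, short at high f_k."""
    return [int(math.ceil(q * fs / f_k)) for f_k in freqs]


def cqt_frame(signal, fs, freqs, lengths):
    """Naive constant-Q transform of one frame (direct evaluation, no FFT)."""
    coeffs = []
    for f_k, n_k in zip(freqs, lengths):
        n_k = min(n_k, len(signal))
        acc = 0j
        for n in range(n_k):
            hann = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_k)
            acc += hann * signal[n] * cmath.exp(-2j * math.pi * f_k * n / fs)
        coeffs.append(acc / n_k)
    return coeffs


def cepstral_coeffs(cq_coeffs, n_ceps):
    """DCT-II of the log-power spectrum (simplified: no uniform resampling)."""
    log_pow = [math.log(abs(c) ** 2 + 1e-12) for c in cq_coeffs]
    n = len(log_pow)
    return [
        sum(log_pow[l] * math.cos(math.pi * p * (l + 0.5) / n) for l in range(n))
        for p in range(n_ceps)
    ]


# Illustrative settings (assumptions, not values from the paper).
FS = 16000
freqs = cq_center_freqs(f_min=110.0, bins_per_octave=12, n_bins=48)
q = q_factor(12)
lengths = window_lengths(FS, freqs, q)

# Synthetic test tone placed exactly on bin 24 (110 Hz * 2**(24/12) = 440 Hz).
tone = [math.cos(2.0 * math.pi * 440.0 * n / FS) for n in range(4096)]
cq = cqt_frame(tone, FS, freqs, lengths)
magnitudes = [abs(c) for c in cq]
peak_bin = magnitudes.index(max(magnitudes))
ceps = cepstral_coeffs(cq, n_ceps=13)

print("Q factor:", round(q, 3))
print("window length, lowest vs. highest bin:", lengths[0], lengths[-1])
print("bin with largest magnitude:", peak_bin)
```

Because the window length shrinks as the center frequency rises, low bins get fine frequency resolution and high bins get fine time resolution, which is the trade-off the abstract refers to. In a full system the resulting coefficient vectors, one per frame, would be passed to a classifier such as a GMM or SVM; that stage is left out here.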