DISTRIBUTION LEARNING FOR AGE ESTIMATION FROM SPEECH
Amruta Saraf, Elie Khoury
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:12:34
Age estimation from speech is becoming important with increasing usage of the voice channel. Call centers can use age estimates to influence call routing or to provide security by comparing the estimate with the speaker's age on file. Voice assistants can use it for parental control applications. The problem of age estimation from speech has often been viewed as a regression or classification problem. However, these methods do not explicitly incorporate the ordinal ranking or the uncertainty that humans often rely on when estimating age. In this work, we hypothesize that the age follows a normal distribution centered around the real age with a particular confidence interval. We investigate three different distribution learning losses, namely KL divergence, GJM distance, and mean-and-variance loss. Cross-dataset experiments were conducted on the NIST SRE08/10 and AgeVoxCeleb data, and the results show that the distribution learning methods are very competitive and in most cases better than traditional approaches.
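
The hypothesis above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes integer age bins from 0 to 100 and a fixed standard deviation of 2 years, neither of which is given in the abstract, and all function names are hypothetical. It builds a discretized normal target around the real age and evaluates two of the three losses: KL divergence, and a mean-and-variance loss in one common formulation (half the squared error of the predicted mean, plus the predicted variance). The GJM distance is omitted because the abstract does not define it.

```python
import numpy as np

# Assumption: ages are modeled as integer bins 0..100 (not stated in the abstract).
AGES = np.arange(0, 101)

def gaussian_label(true_age, sigma=2.0):
    """Discretized normal distribution over AGES centered on true_age.

    sigma is a hypothetical choice standing in for the 'confidence interval'.
    """
    weights = np.exp(-0.5 * ((AGES - true_age) / sigma) ** 2)
    return weights / weights.sum()

def softmax(logits):
    """Turn raw model scores into a probability distribution over AGES."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), summed over age bins."""
    return float(np.sum(target * (np.log(target + eps) - np.log(predicted + eps))))

def mean_variance_loss(predicted, true_age):
    """Return (mean term, variance term) of a mean-and-variance loss.

    Assumed formulation: 0.5 * (predicted mean - true age)^2 and the
    variance of the predicted distribution.
    """
    mean = np.sum(AGES * predicted)
    variance = np.sum(predicted * (AGES - mean) ** 2)
    return float(0.5 * (mean - true_age) ** 2), float(variance)

if __name__ == "__main__":
    target = gaussian_label(30)
    # Hypothetical model output: a wider distribution peaked at age 33.
    predicted = softmax(-0.5 * ((AGES - 33) / 4.0) ** 2)
    print("KL divergence:", kl_divergence(target, predicted))
    print("mean/variance terms:", mean_variance_loss(predicted, 30))
```

Unlike a one-hot classification target, the soft target assigns nearby ages a nonzero probability, so a prediction that is off by three years is penalized less than one that is off by thirty, which is how ordinal structure and uncertainty enter the loss.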