SESQA: Semi-Supervised Learning for Speech Quality Assessment
Joan Serrà, Jordi Pons, Santiago Pascual
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:06:46
Automatic speech quality assessment is an important, transversal task whose progress is hampered by the scarcity of human annotations, poor generalization to unseen recording conditions, and a lack of flexibility of existing approaches. In this work, we tackle these problems with a semi-supervised learning approach, combining available annotations with programmatically generated data, and using 3 different optimization criteria together with 5 complementary auxiliary tasks. Our results show that such a semi-supervised approach can cut the error of existing methods by more than 36%, while providing additional benefits in terms of reusable features or auxiliary outputs. Improvement is further corroborated with an out-of-sample test showing promising generalization capabilities.
Chairs:
Ina Kodrasi