A Comparison Of Convolutional Neural Networks For Glottal Closure Instant Detection From Raw Speech
Jindrich Matousek, Daniel Tihelka
Length: 00:06:44
In this paper, we continue to investigate the use of machine learning for the automatic detection of glottal closure instants (GCIs) from raw speech. We compare several deep one-dimensional convolutional neural network architectures on the same data and show that the InceptionV3 model yields the best results on the test set. On publicly available databases, the proposed 1D InceptionV3 outperforms XGBoost, a non-deep machine learning model, as well as other traditional GCI detection algorithms.
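To make the idea of a one-dimensional Inception-style layer concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes a single-channel speech frame and applies several convolutions with different kernel sizes in parallel, plus a max-pooling branch, then stacks the results as output channels. The function names, the toy frame, and the kernel weights are all hypothetical; a real network would learn the weights, use many channels per branch, and add the 1x1 reductions and factorized convolutions that InceptionV3 relies on.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate a 1D signal with an odd-length kernel.

    Zero padding keeps the output the same length as the input.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def max_pool1d_same(signal, size=3):
    """Sliding-window maximum with stride 1; windows are clipped at the edges."""
    half = size // 2
    n = len(signal)
    return [max(signal[max(0, i - half): min(n, i + half + 1)]) for i in range(n)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def inception_block_1d(signal, kernels):
    """Run parallel branches over the same input and stack them as channels.

    Each kernel gives one convolution branch followed by ReLU; the last
    branch is max pooling. Every branch preserves the input length.
    """
    branches = [relu(conv1d_same(signal, k)) for k in kernels]
    branches.append(max_pool1d_same(signal))
    return branches


# Hypothetical toy frame of raw speech samples (illustrative values only).
frame = [0.0, 0.2, 1.0, -0.8, -0.1, 0.0, 0.1, 0.9, -0.7, 0.0]

# Hypothetical hand-picked kernels of sizes 1, 3, and 5.
kernels = [
    [1.0],                       # pointwise (size-1) branch
    [-1.0, 0.0, 1.0],            # short-context difference filter
    [0.2, 0.2, 0.2, 0.2, 0.2],   # wider-context smoothing filter
]

features = inception_block_1d(frame, kernels)
```

Here `features` holds four channels, each as long as the input frame. In a GCI detector of the kind described above, a stack of such blocks would feed a classifier that decides whether the frame contains a glottal closure instant; that final stage is omitted from this sketch.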
Chairs:
Torbjørn Svendsen