Noise-Robust Key-Phrase Detectors For Automated Classroom Feedback
Brian Zylich, Jacob Whitehill
SPS
Length: 12:59
With the goal of giving teachers automated feedback about their classrooms, we investigate how to train automatic speech detectors of key phrases such as "good job", "thank you", "please", and "you're welcome". This kind of language conveys support and respect from teacher to student and is one of the behavioral markers used in the established CLASS [1] classroom observation protocol. School classrooms are noisy and contain overlapping speech, which makes them a highly challenging environment for automatic speech recognition (ASR), even for state-of-the-art approaches. We train deep neural networks using hierarchical multitask learning (MTL) on a modest-sized but highly tailored dataset of classroom speech. Compared to two state-of-the-art ASR systems for general-purpose speech recognition (Google [2] and DeepSpeech [3]), our system delivers a substantially higher recall (50.4% versus 20.5%) while matching their precision (30%). Moreover, our system's predictions correlate with several dimensions of the CLASS.
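The comparison above reports recall at a matched precision level. The abstract does not describe how the operating point was chosen, so the following is only a minimal sketch of one common way to compute recall at a fixed precision by sweeping a detection threshold. The function name `recall_at_precision` and the toy scores and labels are hypothetical and are not taken from the paper.

```python
def recall_at_precision(scores, labels, target_precision):
    """Highest recall over all score thresholds whose precision >= target_precision.

    scores: detector confidences, one per candidate segment (hypothetical input).
    labels: 1 if the key phrase was really spoken in that segment, else 0.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Sort by descending score: lowering the threshold admits one detection at a time.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_recall = 0.0
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Tied scores cannot be separated by a threshold, so evaluate only
        # after the last item of each group of equal scores.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        precision = tp / (tp + fp)
        recall = tp / total_pos
        if precision >= target_precision:
            best_recall = max(best_recall, recall)
    return best_recall


# Toy data for illustration only (not results from the paper).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 0, 1, 0, 0, 1]
toy_recall = recall_at_precision(toy_scores, toy_labels, target_precision=0.6)
```

Fixing precision and comparing recall, as in this sketch, puts detectors with different score scales on a common footing: each system is evaluated at the threshold that gives the same false-alarm behavior.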