Restoring Arabic syntactic diacritics with Long Short-Term Memory (LSTM) networks yields state-of-the-art performance. These LSTM taggers are commonly augmented with Maximum Entropy (MaxEnt) sparse direct connections between the input and output layers. One way to improve tagger performance is to use an ensemble of taggers, but an ensemble may require substantial computational and memory resources. In this paper, we apply a knowledge distillation technique in which an ensemble of teacher taggers is used to train a single student tagger. Separately, Arabic is a morphologically rich language with a high Out-Of-Vocabulary (OOV) rate. To address this, we propose augmenting word embeddings with character embeddings encoded by an LSTM for each word. On the Arabic Treebank task, our hybrid LSTM/MaxEnt tagger using these two techniques achieves a 1.0% absolute WER improvement over a strong baseline.
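The abstract does not spell out either technique, but both follow standard recipes. As a rough illustration only, a minimal PyTorch sketch of the two ideas might look as follows; all class names, dimensions, and the temperature/interpolation parameters are assumptions for illustration, and this is not the paper's actual hybrid LSTM/MaxEnt tagger.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharWordEncoder(nn.Module):
    """Encode each word from its characters with an LSTM, then concatenate
    the result with the word embedding (a common way to mitigate OOV words)."""
    def __init__(self, n_chars, n_words, char_dim=32, word_dim=128, char_hidden=64):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.char_lstm = nn.LSTM(char_dim, char_hidden, batch_first=True)

    def forward(self, word_ids, char_ids):
        # char_ids: (batch, word_len) character indices, one word per row
        _, (h_n, _) = self.char_lstm(self.char_emb(char_ids))
        char_vec = h_n[-1]  # final LSTM hidden state summarizes the word's characters
        return torch.cat([self.word_emb(word_ids), char_vec], dim=-1)

def distillation_loss(student_logits, teacher_logits_list, gold, T=2.0, alpha=0.5):
    """Knowledge distillation: train the student on the averaged soft targets
    of a teacher ensemble, mixed with the usual hard-label cross-entropy.
    T and alpha are illustrative hyperparameters, not values from the paper."""
    teacher_probs = torch.stack(
        [F.softmax(t / T, dim=-1) for t in teacher_logits_list]).mean(0)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    teacher_probs, reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, gold)
    return alpha * soft + (1 - alpha) * hard
```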
Chairs: Eric Fosler-Lussier