
LeanSpeech: The Microsoft Lightweight Speech Synthesis System for LIMMITS Challenge 2023

Chen Zhang (Microsoft); Shubham Bansal (Microsoft); Aakash Lakhera (Microsoft); Jinzhu Li (Microsoft); Gag Wang (Microsoft); Sandeep Kumar Satpal (Microsoft, India); Sheng Zhao (Microsoft); Lei He (Microsoft Cloud and AI)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
10 Jun 2023

This paper describes LeanSpeech, the Microsoft text-to-speech (TTS) system for the LIMMITS (Lightweight, Multi-speaker, Multi-lingual Indic TTS) Challenge 2023, held as part of ICASSP 2023 to encourage advances in TTS for Indian languages. We propose a lightweight encoder-decoder acoustic model (AM) composed of 1-D convolution and LSTM blocks, trained with knowledge distillation from a multi-speaker, multi-lingual teacher model, DelightfulTTS. The speech corpus is reprocessed and used for both AM training and vocoder fine-tuning. In Track 2 of the challenge, our system achieves a MOS of 4.56 and an SMOS of 3.98, which indicates the effectiveness of the proposed model and training strategy.
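
As a rough illustration of the knowledge-distillation idea mentioned in the abstract, the sketch below combines two terms: how far the student's predicted mel-spectrogram frames are from the teacher's output, and how far they are from the ground-truth frames. The abstract does not state the actual loss, so the L1 distance, the `alpha` weighting, and all function names here are assumptions made for illustration, not the authors' method.

```python
from typing import List

# A mel-spectrogram represented as [time_step][mel_bin] (illustrative type).
Frames = List[List[float]]


def l1_distance(a: Frames, b: Frames) -> float:
    """Mean absolute difference between two equally shaped frame sequences."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        raise ValueError("frame sequences must have the same shape")
    total = sum(abs(x - y) for fa, fb in zip(a, b) for x, y in zip(fa, fb))
    count = sum(len(fa) for fa in a)
    return total / count if count else 0.0


def distillation_loss(student: Frames, teacher: Frames, target: Frames,
                      alpha: float = 0.5) -> float:
    """Hypothetical combined loss: alpha weights the teacher-matching term,
    (1 - alpha) weights the ground-truth term. Both terms are assumptions."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return (alpha * l1_distance(student, teacher)
            + (1.0 - alpha) * l1_distance(student, target))


if __name__ == "__main__":
    student = [[0.0, 1.0], [2.0, 3.0]]
    teacher = [[0.0, 1.0], [2.0, 4.0]]
    target = [[2.0, 1.0], [2.0, 3.0]]
    print(distillation_loss(student, teacher, target, alpha=0.5))
```

Setting `alpha` to 1.0 would train the student purely to imitate the teacher, while 0.0 would ignore the teacher entirely; a real system would typically compute such a loss on tensors in a deep-learning framework rather than on Python lists.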
