Locale Encoding for scalable multilingual keyword spotting models

Pai Zhu (Google); Hyun Jin Park (Google Inc.); Alex Park (Google); Angelo Scorza Scarpati (Google); Ignacio Lopez Moreno (Google)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
06 Jun 2023

A multilingual keyword spotting (KWS) system detects spoken keywords across multiple locales. Conventional monolingual KWS approaches do not scale well to multilingual settings because of high development and maintenance costs and a lack of resource sharing. To overcome this limitation, we propose two locale-conditioned universal models, one using locale feature concatenation and one using feature-wise linear modulation (FiLM). We compare these models with two baselines: locale-specific monolingual KWS models and a single universal model trained on all data. Experiments on 10 localized language datasets show that the locale-conditioned models substantially improve accuracy over the baselines across all locales and under different noise conditions. FiLM performed best, reducing the false reject rate (FRR) by 61% on average (relative) compared with similarly sized monolingual KWS models.
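
To make the two conditioning schemes concrete, here is a minimal NumPy sketch of how they are commonly formulated. It is not the authors' implementation: the one-hot locale vector, the layer sizes, the random weights, and the names `concat_locale` and `film` are all assumptions for illustration. Concatenation appends a locale vector to each feature frame, while FiLM derives a per-channel scale (gamma) and shift (beta) from a locale embedding and applies them to the features.

```python
import numpy as np

NUM_LOCALES = 10   # the abstract reports experiments on 10 localized datasets
FEATURE_DIM = 8    # hypothetical hidden feature size
EMBED_DIM = 4      # hypothetical locale embedding size

rng = np.random.default_rng(0)

# Hypothetical parameters; in a real model these would be learned, not random.
locale_embeddings = rng.normal(size=(NUM_LOCALES, EMBED_DIM))
w_gamma = rng.normal(size=(EMBED_DIM, FEATURE_DIM))
w_beta = rng.normal(size=(EMBED_DIM, FEATURE_DIM))


def concat_locale(features, locale_id):
    """Locale feature concatenation: append a one-hot locale vector to every frame."""
    one_hot = np.zeros(NUM_LOCALES)
    one_hot[locale_id] = 1.0
    tiled = np.tile(one_hot, (features.shape[0], 1))
    return np.concatenate([features, tiled], axis=1)


def film(features, locale_id):
    """FiLM: scale and shift each feature channel with locale-dependent gamma and beta."""
    emb = locale_embeddings[locale_id]
    gamma = emb @ w_gamma  # shape (FEATURE_DIM,)
    beta = emb @ w_beta    # shape (FEATURE_DIM,)
    return gamma * features + beta  # broadcast over the time axis


# Five frames of hypothetical acoustic features, conditioned on locale 3.
frames = rng.normal(size=(5, FEATURE_DIM))
concat_out = concat_locale(frames, locale_id=3)
film_out = film(frames, locale_id=3)
print(concat_out.shape, film_out.shape)
```

In this sketch, concatenation widens the input to the next layer by `NUM_LOCALES` columns, whereas FiLM leaves the feature shape unchanged and instead modulates each channel.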
