Training Large-Vocabulary Neural Language Models by Private Federated Learning for Resource-Constrained Devices
Mingbin Xu (Apple); Congzheng Song (Apple); Ye Tian (Apple); Neha Agrawal (Apple); Filip Granqvist (Apple); Rogier C van Dalen (Samsung AI Center, Cambridge, UK); Xiao Zhang (Apple); Arturo Argueta (Apple); Shiyi Han (Apple); Yaqiao Deng (Apple); Leo Liu (Apple); Anmol Walia (Apple); Alex Jin (Apple)
Federated Learning (FL) is a technique for training models on distributed edge devices using their local data samples.
Differential Privacy (DP) can be combined with FL to provide a formal privacy guarantee for sensitive on-device data. Our goal is to train a large neural network language model (NNLM) on compute-constrained devices while preserving privacy using FL and DP. However, the DP noise added to the model grows with the model size, which often prevents convergence.
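To make the scaling issue concrete, the following minimal sketch assumes the standard DP-FL recipe of per-update L2 clipping plus per-coordinate Gaussian noise (an assumption for illustration, not necessarily the exact mechanism used in this work). For a fixed clipping bound, the norm of the clipped update stays constant while the norm of the added noise grows with the square root of the number of parameters, so the signal-to-noise ratio of each aggregated update degrades as the model gets larger.

import numpy as np

# Illustrative values only: clipping bound C and noise multiplier sigma.
rng = np.random.default_rng(0)
clip_norm = 1.0
noise_mult = 1.0

for dim in (10_000, 1_000_000, 10_000_000):
    # Synthetic client update, clipped to L2 norm C.
    update = rng.normal(size=dim)
    update *= clip_norm / np.linalg.norm(update)
    # Per-coordinate Gaussian noise with std sigma * C.
    noise = rng.normal(scale=noise_mult * clip_norm, size=dim)
    # Ratio of update norm to noise norm, roughly 1 / (sigma * sqrt(dim)).
    snr = np.linalg.norm(update) / np.linalg.norm(noise)
    print(f"dim={dim:>10,}  signal-to-noise ratio ~ {snr:.1e}")

Shrinking the transmitted payload, as the techniques below do, therefore improves this ratio directly under the same privacy budget.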
We propose Partial Embedding Updates (PEU), a novel technique that reduces the impact of DP noise by decreasing the payload size. Furthermore, we adopt Low-Rank Adaptation (LoRA) and Noise Contrastive Estimation (NCE) to reduce the memory demands of large models on compute-constrained devices. We demonstrate, in simulation and on real devices, that this combination of techniques makes it possible to train large-vocabulary language models while preserving accuracy and privacy.
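As a concrete sketch of the parameter-efficiency idea behind LoRA, the snippet below wraps a frozen PyTorch linear layer with a trainable low-rank adapter; the class name, rank, and scaling are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update W x + (alpha / r) * B A x."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)      # pretrained weights stay fixed
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only lora_a and lora_b are trainable, so only they need to be
        # updated on device and sent back for aggregation.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(1024, 1024), rank=8)
out = layer(torch.randn(4, 1024))  # ~16K trainable adapter weights vs. ~1M frozen ones

Training and transmitting only such adapter parameters (and, with PEU, only a subset of embedding rows) keeps both the on-device memory footprint and the noised federated payload small.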