Online Model Compression for Federated Learning with Large Models
Tien-Ju Yang (Google); Yonghui Xiao (Google); Giovanni Motta (Google); Françoise Beaufays (Google); Rajiv Mathews (Google); Mingqing Chen (Google)
This paper addresses the challenges of training large neural networks under federated learning settings: high on-device memory usage and communication cost. The proposed Online Model Compression (OMC) provides a framework that stores model parameters in a compressed format and decompresses them only when needed. We use quantization as the compression scheme in this paper and propose three techniques, (1) per-variable transformation, (2) weight-matrix-only quantization, and (3) partial variable quantization, to minimize its impact on model accuracy. Our experiments on two recent neural networks for speech recognition and two different datasets show that OMC can reduce the memory usage and communication cost of model parameters by up to 59% while attaining comparable accuracy and training speed relative to full-precision federated learning.
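To make the idea concrete, below is a minimal sketch of how an online-compressed variable might work: the tensor is kept quantized (uint8 here) and dequantized only when read, with a per-variable affine transformation (scale and offset). All class and function names are hypothetical illustrations, not the authors' implementation.

```python
# Illustrative sketch only; names and quantization details are assumptions,
# not the paper's actual implementation.
import numpy as np

class CompressedVariable:
    """Stores a tensor in quantized form; decompresses only when accessed."""

    def __init__(self, values: np.ndarray, num_bits: int = 8):
        self.num_bits = num_bits
        self._quantize(values.astype(np.float32))

    def _quantize(self, values: np.ndarray) -> None:
        # Per-variable affine transformation: one scale/offset per variable,
        # mapping the variable's value range onto [0, 2^bits - 1].
        levels = (1 << self.num_bits) - 1
        self.offset = float(values.min())
        self.scale = max(float(values.max()) - self.offset, 1e-12) / levels
        codes = np.round((values - self.offset) / self.scale)
        self.codes = codes.astype(np.uint8)  # compressed storage

    def read(self) -> np.ndarray:
        # Decompress on demand, e.g. just before a forward/backward pass.
        return self.codes.astype(np.float32) * self.scale + self.offset

    def write(self, values: np.ndarray) -> None:
        # Re-compress updated values, e.g. after a local optimizer step.
        self._quantize(values.astype(np.float32))

# Usage: weight matrices stay compressed between uses, cutting their
# in-memory footprint relative to full-precision float32 storage.
w = CompressedVariable(np.random.randn(256, 256).astype(np.float32))
dense_w = w.read()          # decompressed for computation
w.write(dense_w - 0.01)     # updated and stored compressed again
```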