Batch Normalization Damages Federated Learning on Non-IID Data: Analysis and Remedy
Yanmeng Wang (The Chinese University of Hong Kong, Shenzhen); Qingjiang Shi (Tongji University); Tsung-Hui Chang (The Chinese University of Hong Kong, Shenzhen)
Batch normalization (BN) has been widely used to accelerate the training of deep neural networks. However, recent findings show that, in federated learning (FL) scenarios, BN can damage learning performance when clients hold non-i.i.d. data. While several FL schemes have been proposed to address this issue, they still suffer a significant performance loss compared to the centralized scheme. Moreover, none of them analytically explains how BN affects FL convergence. In this paper, we present the first convergence analysis showing that the mismatch between local and global statistical parameters caused by non-i.i.d. data induces gradient deviation, which drives the algorithm to converge to a biased solution at a slower rate. To remedy this, we further propose a new FL algorithm, called FedTAN, which is based on an iterative layer-wise parameter aggregation procedure. Experimental results demonstrate the superiority of FedTAN.
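To illustrate the underlying idea, the following is a minimal PyTorch sketch of aggregating BatchNorm statistics layer by layer across clients so that all clients share the same global mean and variance. It is not the authors' exact FedTAN procedure (which is specified in the paper); the model structure, the uniform client weighting, and all names below are illustrative assumptions.

```python
# Minimal sketch: layer-wise aggregation of BN running statistics across clients.
# Assumes identical model architectures on all clients and uniform client weights.
import torch
import torch.nn as nn

def make_model():
    return nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())

clients = [make_model() for _ in range(4)]   # local copies held by 4 clients
global_model = make_model()                  # server-side model

# Aggregate BN statistics layer by layer on the server.
for name, module in global_model.named_modules():
    if isinstance(module, nn.BatchNorm2d):
        local_bns = [dict(c.named_modules())[name] for c in clients]
        # Global mean: average of the local running means.
        global_mean = torch.stack([bn.running_mean for bn in local_bns]).mean(dim=0)
        # Global variance: average local second moments minus squared global mean,
        # so every client ends up normalizing with one consistent (mean, var) pair.
        second_moment = torch.stack(
            [bn.running_var + bn.running_mean ** 2 for bn in local_bns]
        ).mean(dim=0)
        global_var = second_moment - global_mean ** 2
        module.running_mean.copy_(global_mean)
        module.running_var.copy_(global_var.clamp(min=0.0))

# Broadcast the aggregated global model (including BN statistics) back to clients.
for c in clients:
    c.load_state_dict(global_model.state_dict())
```

In this sketch the server reconciles the local BN statistics that would otherwise diverge under non-i.i.d. data; FedTAN's actual iterative procedure and its convergence guarantees are given in the paper.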