Anytime Minibatch With Delayed Gradients: System Performance And Convergence Analysis
Haider Al-Lawati, Stark Draper
SPS
We present a convergence analysis of the Anytime Minibatch with Delayed Gradients (AMB-DG) algorithm. In AMB-DG, workers compute gradients in epochs of fixed duration while the master uses stale gradients to update the optimization parameters. We analyze AMB-DG in terms of its regret bound and convergence rate. For convex smooth objective functions, we show that AMB-DG achieves the optimal regret bound and the optimal convergence rate. To complement our theoretical contributions, we deploy AMB-DG on SciNet, an academic high-performance cloud computing platform, and compare its performance with that of the $K$-batch async scheme, which serves as a baseline for schemes that exploit the work completed by all workers while using stale gradients. In our experiments on MNIST, AMB-DG converges $2.45$ times faster than $K$-batch async.
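The core mechanism described above, a master updating parameters with gradients evaluated at an older (stale) iterate, can be illustrated with a minimal sketch. This is not the authors' implementation; the quadratic objective, step size, and delay below are illustrative assumptions chosen so that the delayed iteration still converges.

```python
import numpy as np

def grad(x):
    # Gradient of the convex smooth toy objective f(x) = 0.5 * ||x||^2.
    return x

def delayed_gradient_descent(x0, lr=0.1, delay=3, steps=100):
    # Keep a history of iterates so we can look up the stale point
    # the "workers" would have computed their gradient at.
    history = [np.array(x0, dtype=float)]
    x = history[0].copy()
    for t in range(steps):
        # Master update uses the gradient at the iterate from
        # `delay` steps ago, mimicking stale-gradient updates.
        stale = history[max(0, t - delay)]
        x = x - lr * grad(stale)
        history.append(x.copy())
    return x

x_final = delayed_gradient_descent([5.0, -3.0])
print(np.linalg.norm(x_final))  # approaches 0 despite the delay
```

For this smooth convex objective the iteration tolerates the fixed delay as long as the step size is small enough; with a larger `lr` or `delay` the same recursion can diverge, which is why staleness enters the regret and convergence analysis.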