BHE-DARTS: Bilevel Optimization based on Hypergradient Estimation for Differentiable Architecture Search
Zicheng Cai (Guangdong University of Technology); Lei Chen (Guangdong University of Technology); Hai-Lin Liu (Guangdong University of Technology)
SPS
In this paper, we propose BHE-DARTS, a stochastic bilevel optimization approach based on a hypergradient estimator. It addresses the tendency of the bilevel optimization model in Differentiable Architecture Search (DARTS) to find locally optimal structures rather than globally optimal ones. Specifically, we update the lower-level variable $\omega$ with a stochastic gradient and design a hypergradient estimator, built from Jacobian- and Hessian-vector products, to assist in updating the upper-level variable $\alpha$. This makes fuller use of the gradient information and helps the search escape local optima in the neural architecture search (NAS) bilevel model.
Experimental studies show that BHE-DARTS performs competitively with state-of-the-art DARTS methods in the DARTS search space (CIFAR-100: 82.69% test accuracy) and in the NAS-Bench-201 search space (ImageNet16-120: 42.44% test accuracy).
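
To make the idea of a hypergradient built from Jacobian- and Hessian-vector products concrete, the following is a minimal sketch on a toy quadratic bilevel problem. It is not the BHE-DARTS estimator itself, whose exact form the abstract does not give. It assumes the standard implicit-differentiation hypergradient $\nabla_\alpha f - \nabla^2_{\alpha\omega} g \, [\nabla^2_{\omega\omega} g]^{-1} \nabla_\omega f$ and approximates the inverse-Hessian-vector product with a truncated Neumann series. The objectives, the matrices `A`, `B`, `t`, `lam`, all function names, and all step sizes are hypothetical choices for illustration, and full gradients stand in for the stochastic (minibatch) gradients a real search would use.

```python
import numpy as np

# Toy bilevel problem (hypothetical, for illustration only):
#   lower level: g(w, a) = 0.5 * w^T A w - w^T B a, so w*(a) = A^{-1} B a
#   upper level: f(w, a) = 0.5 * ||w - t||^2 + 0.5 * lam * ||a||^2
A = np.array([[2.0, 0.5], [0.5, 1.0]])            # symmetric positive definite
B = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]])
t = np.array([1.0, -1.0])
lam = 0.1

def grad_w_g(w, a):
    return A @ w - B @ a          # lower-level gradient w.r.t. w

def grad_w_f(w, a):
    return w - t                  # upper-level gradient w.r.t. w

def grad_a_f(w, a):
    return lam * a                # upper-level gradient w.r.t. a

def hvp_ww_g(w, a, v):
    return A @ v                  # Hessian-vector product (d^2 g / dw^2) v

def jvp_aw_g(w, a, v):
    return -B.T @ v               # cross-derivative product (d^2 g / da dw) v

def neumann_inverse_hvp(w, a, v, eta, K):
    """Approximate H^{-1} v by eta * sum_{k=0..K} (I - eta * H)^k v, using only HVPs."""
    p = v.copy()
    acc = v.copy()
    for _ in range(K):
        p = p - eta * hvp_ww_g(w, a, p)
        acc = acc + p
    return eta * acc

def hypergradient(w, a, eta=0.3, K=200):
    """Estimate dF/da for F(a) = f(w*(a), a) via implicit differentiation."""
    q = neumann_inverse_hvp(w, a, grad_w_f(w, a), eta, K)
    return grad_a_f(w, a) - jvp_aw_g(w, a, q)

def upper_objective(a):
    """Exact F(a), computed with the closed-form lower-level solution."""
    w_star = np.linalg.solve(A, B @ a)
    return 0.5 * np.sum((w_star - t) ** 2) + 0.5 * lam * np.sum(a ** 2)

def search(outer_steps=200, inner_steps=10, beta=0.3, gamma=0.05):
    """Alternate gradient steps on w (lower level) with hypergradient steps on a."""
    w, a = np.zeros(2), np.zeros(3)
    for _ in range(outer_steps):
        for _ in range(inner_steps):
            w = w - beta * grad_w_g(w, a)
        a = a - gamma * hypergradient(w, a)
    return w, a
```

In this sketch the Neumann series converges because `eta` is chosen below $2/\lambda_{\max}(A)$; for a neural network the Hessian is never formed explicitly, and the two product functions would be computed by automatic differentiation instead.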