A Zeroth-Order Learning Algorithm For Ergodic Optimization Of Wireless Systems With No Models And No Gradients
Dionysios Kalogerias, Mark Eisen, George Pappas, Alejandro Ribeiro
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 15:00
Optimal resource allocation in real-world wireless systems is challenging, not only because accurate statistical channel models are unavailable, but also because expressions for maximal or achievable information rates are most often unknown or not adequately precise. Under a modular stochastic functional optimization framework, we propose a new zeroth-order stochastic primal-dual algorithm for completely data-driven, model-free and gradient-free learning of optimal resource allocation policies for ergodic network optimization. Our contribution relies on Gaussian smoothing of the corresponding constrained policy search problem, and on the representation power of universal policy parameterizations, such as Deep Neural Networks (DNNs). Indeed, our simulations demonstrate that DNN-based policies produced by the proposed primal-dual method attain near-ideal performance based exclusively on limited channel probing, completely bypassing the need for gradient computations, and in the absence of channel or information rate models.
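To make the idea of a zeroth-order primal-dual update concrete, the following is a minimal sketch on a toy single-link power allocation problem. It is not the authors' algorithm or experimental setup. Everything in it is an assumption made for illustration: the function names, the exponential fading draw, the `log(1 + h p)` stand-in for the probed rate, the two-parameter softplus policy (used here in place of a DNN), the average power budget `p_max`, and all step sizes. The learner only uses function values returned by `probe_rate`, never its gradient, and estimates the primal ascent direction from a two-point Gaussian perturbation of the policy parameters.

```python
import numpy as np

def probe_rate(power, h):
    # Assumed stand-in for the black-box system: the learner only sees
    # the returned values, never this formula or its gradient.
    return np.log1p(h * power)

def policy(theta, h):
    # Hypothetical two-parameter policy: softplus of an affine function
    # of the channel state, so the allocated power is always positive.
    return np.logaddexp(0.0, theta[0] + theta[1] * h)

def lagrangian_sample(theta, lam, h, p_max):
    # Sample average of: rate - lam * (power - p_max).
    p = policy(theta, h)
    return np.mean(probe_rate(p, h)) - lam * (np.mean(p) - p_max)

def zeroth_order_primal_dual(steps=2000, batch=64, mu=1e-2,
                             lr_theta=5e-2, lr_lam=5e-2,
                             p_max=1.0, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)   # policy parameters (primal variable)
    lam = 1.0             # multiplier of the average power constraint
    for _ in range(steps):
        h = rng.exponential(1.0, size=batch)   # assumed fading samples
        u = rng.standard_normal(theta.shape)   # Gaussian smoothing direction
        base = lagrangian_sample(theta, lam, h, p_max)
        pert = lagrangian_sample(theta + mu * u, lam, h, p_max)
        # Two-point estimate of the gradient of the smoothed Lagrangian.
        grad_est = (pert - base) / mu * u
        theta = theta + lr_theta * grad_est    # primal ascent step
        # Dual step: raise lam when the power budget is exceeded,
        # lower it otherwise, and project onto lam >= 0.
        violation = np.mean(policy(theta, h)) - p_max
        lam = max(0.0, lam + lr_lam * violation)
    return theta, lam

if __name__ == "__main__":
    theta, lam = zeroth_order_primal_dual()
    print("theta:", theta, "lambda:", lam)
```

Evaluating both Lagrangian samples on the same batch of channel draws is a design choice of this sketch: it keeps the finite difference from being dominated by sampling noise when the smoothing parameter `mu` is small.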