Stochastic Implicit Natural Gradient for Black-box Optimization
Black-box optimization is primarily important for many compute-intensive applications, including reinforcement learning (RL), robot control, etc. This paper presents a novel theoretical framework for black-box optimization, in which our method performs stochastic update within a trust region defined with KL-divergence. We show that this update is equivalent to a natural gradient step w.r.t. natural parameters of an exponential-family distribution. Theoretically, we prove the convergence rate of our framework for convex functions. Our theoretical results also hold for non-differentiable black-box functions. Empirically, our method achieves superior performance compared with the state-of-the-art method CMA-ES on separable benchmark test problems.
READ FULL TEXT