Near-Optimal Fully First-Order Algorithms for Finding Stationary Points in Bilevel Optimization

06/26/2023

Bilevel optimization has applications such as hyper-parameter optimization and meta-learning. Designing theoretically efficient algorithms for bilevel optimization is more challenging than for standard optimization because the lower-level problem defines the feasible set implicitly via another optimization problem. One tractable case is when the lower-level problem is strongly convex. Recent works show that second-order methods can provably converge to an ϵ-first-order stationary point of the problem at a rate of 𝒪̃(ϵ^{-2}), yet these algorithms require a Hessian-vector product oracle. Kwon et al. (2023) removed this requirement by proposing a first-order method that achieves the same goal at a slower rate of 𝒪̃(ϵ^{-3}). In this work, we provide an improved analysis demonstrating that the first-order method can also find an ϵ-first-order stationary point within 𝒪̃(ϵ^{-2}) oracle complexity, which matches the upper bounds for second-order methods in the dependency on ϵ. Our analysis further leads to simple first-order algorithms that achieve similar near-optimal rates in finding second-order stationary points and in distributed bilevel problems.
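To make the idea of a Hessian-free method concrete, here is a minimal sketch of a penalty-style gradient estimator in the spirit of the first-order approach attributed to Kwon et al. (2023). It is an illustration under stated assumptions, not the authors' algorithm: the estimator form ∇_x f(x, y) + λ(∇_x g(x, y) − ∇_x g(x, z)), with y approximately minimizing f + λg and z approximately minimizing g, is our reading of that approach, and the toy objectives, step sizes, loop counts, and function names are all hypothetical choices. For this particular toy problem the exact minimizer of the upper-level objective is x = 0.5, while the penalized iteration settles at λ/(1 + 2λ), so the bias shrinks as λ grows.

```python
# Toy bilevel problem (an assumption for illustration, not from the paper):
#   upper level: f(x, y) = 0.5 * x**2 + 0.5 * (y - 1)**2
#   lower level: g(x, y) = 0.5 * (y - x)**2   (strongly convex in y, y*(x) = x)
# Hence phi(x) = f(x, y*(x)) = 0.5 * x**2 + 0.5 * (x - 1)**2, minimized at x = 0.5.


def grad_f(x, y):
    """Return (df/dx, df/dy) for the toy upper-level objective."""
    return x, y - 1.0


def grad_g(x, y):
    """Return (dg/dx, dg/dy) for the toy lower-level objective."""
    return x - y, y - x


def penalty_hypergradient(x, y, z, lam, inner_steps=10):
    """Estimate the hypergradient using gradients only (no Hessian-vector products).

    y is driven toward argmin_y f(x, y) + lam * g(x, y),
    z is driven toward argmin_z g(x, z).
    Both are warm-started from the values passed in.
    """
    for _ in range(inner_steps):
        _, fy = grad_f(x, y)
        _, gy = grad_g(x, y)
        # The penalized inner objective has curvature 1 + lam in this toy problem.
        y -= 0.5 / (1.0 + lam) * (fy + lam * gy)
        _, gz = grad_g(x, z)
        z -= 0.5 * gz
    fx, _ = grad_f(x, y)
    gx_at_y, _ = grad_g(x, y)
    gx_at_z, _ = grad_g(x, z)
    estimate = fx + lam * (gx_at_y - gx_at_z)
    return estimate, y, z


def run_toy_example(lam=100.0, outer_steps=200, outer_lr=0.3):
    """Run gradient descent on x with the first-order hypergradient estimate."""
    x, y, z = 0.0, 0.0, 0.0
    for _ in range(outer_steps):
        grad, y, z = penalty_hypergradient(x, y, z, lam)
        x -= outer_lr * grad
    return x


if __name__ == "__main__":
    print(run_toy_example())
```

The penalty weight λ trades bias against conditioning: a larger λ brings the stationary point of the penalized problem closer to that of the original one, but makes the inner problem in y stiffer, which is why the inner step size above is scaled by 1/(1 + λ).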
