Sharper Rates for Separable Minimax and Finite Sum Optimization via Primal-Dual Extragradient Methods
We design accelerated algorithms with improved rates for several fundamental classes of optimization problems. Our algorithms all build upon techniques related to the analysis of primal-dual extragradient methods via relative Lipschitzness, proposed recently by [CST21].

(1) Separable minimax optimization. We study separable minimax optimization problems min_x max_y f(x) - g(y) + h(x, y), where f and g have smoothness and strong convexity parameters (L^x, μ^x) and (L^y, μ^y), respectively, and h is convex-concave with a (Λ^xx, Λ^xy, Λ^yy)-blockwise operator norm bounded Hessian. We provide an algorithm with gradient query complexity Õ(√(L^x/μ^x) + √(L^y/μ^y) + Λ^xx/μ^x + Λ^xy/√(μ^x μ^y) + Λ^yy/μ^y). Notably, for convex-concave minimax problems with bilinear coupling (e.g. quadratics), where Λ^xx = Λ^yy = 0, our rate matches a lower bound of [ZHZ19].

(2) Finite sum optimization. We study finite sum optimization problems min_x (1/n) ∑_{i∈[n]} f_i(x), where each f_i is L_i-smooth and the overall problem is μ-strongly convex. We provide an algorithm with gradient query complexity Õ(n + ∑_{i∈[n]} √(L_i/(nμ))). Notably, when the smoothness bounds {L_i}_{i∈[n]} are non-uniform, our rate improves upon accelerated SVRG [LMH15, FGKS15] and Katyusha [All17] by up to a factor of √n.

(3) Minimax finite sums. We generalize our algorithms for minimax and finite sum optimization to solve a natural family of minimax finite sum optimization problems at an accelerated rate, encapsulating both of the above results up to a logarithmic factor.
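As a concrete illustration of result (1), the following LaTeX sketch works out the bilinearly coupled special case mentioned in the abstract. The identification of Λ^xy with the operator norm of the coupling matrix A is our reading of the blockwise Hessian bound (A is a symbol we introduce here), and the simplified rate is obtained by setting Λ^xx = Λ^yy = 0 in the general complexity.

% Sketch (our illustration): the bilinearly coupled special case of result (1).
% Here h(x, y) = y^T A x, so the diagonal Hessian blocks of h vanish and the
% cross block is the constant matrix A, giving the blockwise bounds below.
\[
  \min_x \max_y \; f(x) - g(y) + y^\top A x,
  \qquad
  \Lambda^{xx} = \Lambda^{yy} = 0,
  \quad
  \Lambda^{xy} = \|A\|_{\mathrm{op}}.
\]
% Substituting into the general gradient query complexity yields
\[
  \widetilde{O}\!\left(
    \sqrt{L^x/\mu^x} + \sqrt{L^y/\mu^y}
    + \|A\|_{\mathrm{op}} \big/ \sqrt{\mu^x \mu^y}
  \right),
\]
% which is the rate the abstract reports as matching the lower bound of [ZHZ19].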
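To see where the claimed √n gain in result (2) can come from, here is a hedged back-of-the-envelope comparison (our calculation, not taken from the paper): we assume a baseline rate of Õ(n + √(nL̄/μ)) for accelerated SVRG/Katyusha-style methods, with L̄ = (1/n)∑_i L_i the average smoothness.

% Sketch (our calculation): comparing the new finite sum rate with the
% assumed baseline n + sqrt(n * Lbar / mu), where Lbar = (1/n) sum_i L_i.
% By Cauchy--Schwarz, the new term is never worse than the baseline term:
\[
  \sum_{i \in [n]} \sqrt{\frac{L_i}{n\mu}}
  = \frac{1}{\sqrt{n\mu}} \sum_{i \in [n]} \sqrt{L_i}
  \le \frac{1}{\sqrt{n\mu}} \sqrt{n \sum_{i \in [n]} L_i}
  = \sqrt{\frac{n \bar{L}}{\mu}}.
\]
% The gap is largest when a single smoothness bound dominates:
% with L_1 = M and L_2, \dots, L_n \approx 0,
\[
  \sum_{i \in [n]} \sqrt{\frac{L_i}{n\mu}} \approx \sqrt{\frac{M}{n\mu}}
  \quad \text{versus} \quad
  \sqrt{\frac{n\bar{L}}{\mu}} = \sqrt{\frac{M}{\mu}},
\]
% a factor-of-sqrt(n) separation, consistent with the claimed improvement.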