Non-ergodic linear convergence of delayed gradient descent under strong convexity and the Polyak-Łojasiewicz condition

08/23/2023
by   Hyung Jun Choi, et al.

In this work, we establish a non-ergodic linear convergence estimate for gradient descent with delay τ∈ℕ when the cost function is μ-strongly convex and L-smooth. This result improves upon the well-known estimates of Arjevani et al. <cit.> and Stich and Karimireddy <cit.> in that it is non-ergodic and holds under a weaker assumption on the cost function. Moreover, the admissible range of the learning rate η is extended from η ≤ 1/(10Lτ) to η ≤ 1/(4Lτ) for τ = 1 and η ≤ 3/(10Lτ) for τ ≥ 2, where L > 0 is the Lipschitz constant of the gradient of the cost function. Furthermore, we show linear convergence of the cost function under the Polyak-Łojasiewicz (PL) condition, for which the admissible learning rate improves further to η ≤ 9/(10Lτ) for large delays τ. Finally, numerical experiments are provided to confirm the theoretical results.
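The delayed iteration the abstract refers to can be sketched in a few lines: each update uses the gradient evaluated at a τ-step-old iterate, x_{k+1} = x_k − η∇f(x_{k−τ}). Below is a minimal illustrative simulation on a toy strongly convex quadratic, using the stepsize bound η ≤ 3/(10Lτ) quoted in the abstract for τ ≥ 2; the specific objective, dimensions, and step count are assumptions for illustration, not the paper's experiments.

```python
# Illustrative sketch of delayed gradient descent (not the authors' code).
# Toy objective: f(x) = 0.5 * x^T diag(d) x, so mu = min(d), L = max(d).
import numpy as np

def delayed_gd(grad, x0, eta, tau, n_steps):
    """Iterate x_{k+1} = x_k - eta * grad(x_{k-tau}).
    For k < tau the stale iterate falls back to the initial point."""
    history = [np.array(x0, dtype=float)]
    for k in range(n_steps):
        stale = history[max(k - tau, 0)]      # delayed iterate x_{k-tau}
        history.append(history[-1] - eta * grad(stale))
    return history

d = np.array([0.5, 1.0, 4.0])                  # curvatures: mu = 0.5, L = 4.0
grad = lambda x: d * x
f = lambda x: 0.5 * np.dot(d * x, x)

L, tau = d.max(), 3
eta = 3.0 / (10.0 * L * tau)                   # stepsize bound from the abstract (tau >= 2)

xs = delayed_gd(grad, np.ones(3), eta, tau, 400)
print(f(xs[0]), f(xs[-1]))                     # cost decreases geometrically
```

With these (assumed) parameters the cost f(x_k) decays linearly despite the τ-step staleness, which is the behavior the paper's non-ergodic estimate quantifies.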
