Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis

09/08/2021
by Ziyi Chen, et al.

Actor-critic (AC) algorithms have been widely adopted in decentralized multi-agent systems to learn the optimal joint control policy. However, existing decentralized AC algorithms either do not preserve the privacy of agents or are not sample- and communication-efficient. In this work, we develop two decentralized AC and natural AC (NAC) algorithms that are private as well as sample- and communication-efficient. In both algorithms, agents share noisy information to preserve privacy and adopt mini-batch updates to improve sample and communication efficiency. In particular, for decentralized NAC we develop a decentralized Markovian SGD algorithm with an adaptive mini-batch size to efficiently compute the natural policy gradient. Under Markovian sampling and linear function approximation, we prove that the proposed decentralized AC and NAC algorithms achieve state-of-the-art sample complexities of 𝒪(ϵ^-2 ln(ϵ^-1)) and 𝒪(ϵ^-3 ln(ϵ^-1)), respectively, with the same small communication complexity of 𝒪(ϵ^-1 ln(ϵ^-1)). Numerical experiments demonstrate that the proposed algorithms achieve lower sample and communication complexities than the existing decentralized AC algorithm.
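The abstract's core communication primitive, agents sharing noisy information over a network, can be illustrated with a generic noisy-consensus step. The sketch below is not taken from the paper; the mixing matrix `W`, the `noise_std` level, and all dimensions are hypothetical choices made only to show how noise injection and consensus mixing compose:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_consensus_step(params, W, noise_std):
    """One round of privacy-preserving consensus averaging.

    Each agent broadcasts its parameter vector perturbed by Gaussian
    noise (so neighbors never observe exact local values), then mixes
    the received noisy messages with the doubly stochastic matrix W.
    This is a generic illustration, not the paper's exact update.
    """
    noisy = params + rng.normal(scale=noise_std, size=params.shape)
    return W @ noisy

# Toy setup: 4 agents on a ring, each holding a 3-dimensional parameter.
n_agents, dim = 4, 3
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])  # doubly stochastic mixing weights
params = rng.normal(size=(n_agents, dim))

for _ in range(50):
    params = noisy_consensus_step(params, W, noise_std=0.01)

# With small injected noise, agents contract toward a common value:
# the per-coordinate spread across agents becomes small.
print(np.std(params, axis=0))
```

In the paper's algorithms, such consensus rounds would be interleaved with mini-batch actor and critic updates, which is what reduces the number of communication rounds relative to per-sample communication.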
