Deep Reinforcement Learning with Symmetric Prior for Predictive Power Allocation to Mobile Users

02/10/2021
by Jianyu Zhao, et al.
Deep reinforcement learning has been applied to a variety of wireless tasks, but it is known to incur high training and inference complexity. In this paper, we use the deep deterministic policy gradient (DDPG) algorithm to optimize predictive power allocation among K mobile users requesting video streaming, with the goal of minimizing the energy consumption of the network under a no-stalling constraint for each user. To reduce the sample complexity and model size of the DDPG, we exploit a symmetric prior inherent in the actor and critic networks, namely their permutation-invariant and permutation-equivariant properties, when designing the neural networks. Our analysis shows that the number of free model parameters of the DDPG can be compressed by a ratio of 2/K^2. Simulation results demonstrate that, when K = 10, the number of episodes required by the learning model with the symmetric prior to achieve the same performance as the vanilla policy is reduced by about one third.
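To illustrate where a ratio such as 2/K^2 can come from, the following is a minimal sketch of a permutation-equivariant linear layer with shared weights. It is not taken from the paper: the layer structure (one block `U` applied to each user's own features and one block `V` applied to the sum of the other users' features), the function names, and the feature dimensions are all assumptions chosen for illustration. Under this assumed structure, a dense layer over K users holds K^2 weight blocks, while the shared layer holds only two.

```python
import numpy as np


def equivariant_layer(X, U, V):
    """Assumed permutation-equivariant linear layer (illustrative only).

    X: (K, d_in) array, one row of features per user.
    U: (d_out, d_in) block applied to each user's own features.
    V: (d_out, d_in) block applied to the sum of all other users' features.
    Returns a (K, d_out) array.
    """
    total = X.sum(axis=0, keepdims=True)      # sum of features over all users
    return X @ U.T + (total - X) @ V.T        # own term + "everyone else" term


def dense_equivariant_weight(U, V, K):
    """Expand the two shared blocks into the full (K*d_out, K*d_in) matrix.

    U sits on the K diagonal blocks and V on the K*(K-1) off-diagonal blocks.
    """
    eye = np.eye(K)
    off_diag = np.ones((K, K)) - eye
    return np.kron(eye, U) + np.kron(off_diag, V)


def parameter_ratio(K):
    """Free parameters of the shared layer divided by those of a dense layer."""
    return 2.0 / K**2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, d_in, d_out = 10, 3, 4                 # hypothetical sizes
    U = rng.normal(size=(d_out, d_in))
    V = rng.normal(size=(d_out, d_in))
    X = rng.normal(size=(K, d_in))

    # Permuting the users permutes the outputs in the same way.
    perm = rng.permutation(K)
    same = np.allclose(equivariant_layer(X[perm], U, V),
                       equivariant_layer(X, U, V)[perm])
    print("equivariant:", same)
    print("parameter ratio for K = 10:", parameter_ratio(K))
```

The dense expansion in `dense_equivariant_weight` is included only to make the block structure explicit. In practice only `U` and `V` would be stored and trained, which is what shrinks the parameter count in this sketch.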
