Tiyuntsong: A Self-Play Reinforcement Learning Approach for ABR Video Streaming
Existing reinforcement learning (RL)-based adaptive bitrate (ABR) approaches outperform earlier methods based on fixed control rules by improving the Quality of Experience (QoE) score. However, the QoE metric provides little clear guidance for optimization, which can result in unexpected strategies. In this paper, we propose Tiyuntsong, a self-play reinforcement learning approach with a generative adversarial network (GAN)-based method for ABR video streaming. Tiyuntsong learns strategies automatically by training two agents that compete against each other. The outcome of each competition is judged by a rule rather than a numerical QoE score, and the rule has a clear optimization goal. We also propose a GAN Enhancement Module that extracts hidden features from past status, preserving information without being limited by sequence length. Using testbed experiments, we show that the use of the GAN significantly improves Tiyuntsong's performance. Comparing ABR algorithms, we observe that Tiyuntsong also outperforms existing ABR algorithms on the underlying metrics.
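To make the idea of rule-based competition concrete, the following minimal Python sketch shows how two agents' playback sessions over the same network trace might be compared by a rule and turned into win/loss rewards instead of a weighted QoE score. The abstract does not state the rule itself, so the lexicographic rule used here (less rebuffering wins, ties broken by higher mean bitrate) is an assumption for illustration only, and the names `SessionStats`, `compare_sessions`, and `self_play_rewards` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SessionStats:
    """Summary of one agent's playback session (hypothetical structure)."""
    bitrates_kbps: List[float]  # bitrate chosen for each video chunk
    rebuffer_s: float           # total rebuffering time in seconds


def mean_bitrate(stats: SessionStats) -> float:
    """Average selected bitrate over the session."""
    return sum(stats.bitrates_kbps) / len(stats.bitrates_kbps)


def compare_sessions(a: SessionStats, b: SessionStats, tol: float = 1e-9) -> int:
    """Return 1 if a wins, -1 if b wins, 0 for a draw.

    ASSUMED rule, not taken from the source: the session with less
    rebuffering wins; if rebuffering is equal, the higher mean bitrate wins.
    """
    if abs(a.rebuffer_s - b.rebuffer_s) > tol:
        return 1 if a.rebuffer_s < b.rebuffer_s else -1
    mean_a, mean_b = mean_bitrate(a), mean_bitrate(b)
    if abs(mean_a - mean_b) > tol:
        return 1 if mean_a > mean_b else -1
    return 0


def self_play_rewards(a: SessionStats, b: SessionStats) -> Tuple[int, int]:
    """Zero-sum rewards for the two competing agents on the same trace."""
    outcome = compare_sessions(a, b)
    return outcome, -outcome


if __name__ == "__main__":
    agent_a = SessionStats(bitrates_kbps=[750, 1200, 1850], rebuffer_s=0.0)
    agent_b = SessionStats(bitrates_kbps=[1850, 2850, 2850], rebuffer_s=1.5)
    print(self_play_rewards(agent_a, agent_b))
```

Under this assumed rule, an agent cannot win by trading a large stall for a slightly higher bitrate, which is the kind of unambiguous goal a weighted QoE sum does not guarantee.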