Learning List-wise Representation in Reinforcement Learning for Ads Allocation with Multiple Auxiliary Tasks
With the recent prevalence of reinforcement learning (RL), there has been tremendous interest in utilizing RL for ads allocation in recommendation platforms (e.g., e-commerce and news feed sites). For better performance, recent RL-based ads allocation agents make decisions based on representations of the list-wise item arrangement. This results in a high-dimensional state-action space, which makes it difficult to learn an efficient and generalizable list-wise representation. To address this problem, we propose a novel algorithm that learns a better representation by leveraging task-specific signals on the Meituan food delivery platform. Specifically, we propose three different types of auxiliary tasks, based on reconstruction, prediction, and contrastive learning, respectively. We conduct extensive offline experiments on the effectiveness of these auxiliary tasks and test our method on a real-world food delivery platform. The experimental results show that our method can learn better list-wise representations and achieve higher revenue for the platform.