Hybrid Transfer in Deep Reinforcement Learning for Ads Allocation

04/02/2022
by Guogang Liao, et al.

Ads allocation, which assigns ads and organic items to a limited number of slots in a feed in order to maximize platform revenue, has become a popular problem. However, e-commerce platforms usually have multiple entrances for different categories, and some entrances receive few visits. The data accumulated on these entrances can hardly support the learning of a good agent. To address this challenge, we present Similarity-based Hybrid Transfer for Ads Allocation (SHTAA), which effectively transfers both samples and knowledge from a data-rich entrance to data-poor entrances. Specifically, we define an uncertainty-aware Markov Decision Process (MDP) similarity that estimates how similar the MDPs of different entrances are. Based on this similarity, we design a hybrid transfer method, consisting of instance transfer and strategy transfer, to efficiently transfer samples and knowledge from one entrance to another. Both offline and online experiments on the Meituan food delivery platform demonstrate that our method helps learn a better agent for a data-poor entrance and increases platform revenue.
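
The abstract does not define the similarity measure or the transfer rules, so the following is only an illustrative sketch of the general idea behind similarity-gated instance transfer, not the SHTAA method itself. It assumes a hypothetical similarity score in [0, 1] is already available, and it lets that score decide how many samples from the data-rich (source) entrance are mixed into a training batch for the data-poor (target) entrance. The function name, the `max_source_fraction` parameter, and the linear scaling rule are all assumptions made for illustration.

```python
import random


def similarity_weighted_batch(target_samples, source_samples, similarity,
                              batch_size, max_source_fraction=0.5, rng=None):
    """Mix target and source samples into one training batch.

    `similarity` is a hypothetical MDP-similarity score in [0, 1];
    the linear scaling below is an assumption, not the paper's rule.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    if not target_samples:
        raise ValueError("target_samples must not be empty")
    rng = rng or random.Random()

    # A higher similarity admits more source samples, up to the cap.
    n_source = int(batch_size * max_source_fraction * similarity)
    n_source = min(n_source, len(source_samples))
    n_target = batch_size - n_source

    # The target entrance is data-poor, so sample it with replacement.
    batch = [(s, "target") for s in rng.choices(target_samples, k=n_target)]
    # The source entrance is data-rich, so sample it without replacement.
    batch += [(s, "source") for s in rng.sample(source_samples, n_source)]
    rng.shuffle(batch)
    return batch
```

With `similarity=0.0` the batch contains only target samples, so a dissimilar source entrance contributes nothing; with `similarity=1.0` and the default cap, up to half of the batch comes from the source entrance.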
