Image Matters: Jointly Train Advertising CTR Model with Image Representation of Ad and User Behavior
Click-through rate (CTR) prediction is vital for online advertising systems. Sparse ID features have recently been widely adopted in industry. While ID features, e.g. the serial number of an ad, are computationally cheap and easy to acquire, they reveal little intrinsic information about the ad itself. In this work, we propose a novel Deep Image CTR Model (DICM). DICM i) introduces image content features to exploit the intrinsic description of ads/goods and demonstrates their effectiveness on a complete dataset consistent with the production environment of a real commercial advertising system; ii) not only represents ads with image features, but also, for the first time, jointly models user behaviors and ads with image features to capture user interests. However, users' historical behaviors involve a massive number of images at industrial training scale (e.g. 2.4 million images per mini-batch with a batch size of 60k), which poses great challenges for both computation and storage. To tackle these challenges, we carefully design a highly efficient distributed system that enables daily-updated model training on billions of samples, which is essential for production deployment. Extensive experiments show that image features can be effective representations as well as good complements to the corresponding ID features.
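To make the scale and the joint representation concrete, the following is a minimal sketch, not the authors' implementation. It first works out the per-sample image load implied by the figures quoted above, then shows one plausible way to combine an ad's ID embedding, the ad's image embedding, and a pooled embedding of the user's behavior images. The function names, the use of mean pooling, and the plain concatenation are all assumptions made for illustration; the abstract does not specify how DICM aggregates behavior images.

```python
from typing import List

Vector = List[float]


def images_per_sample(images_per_batch: int, batch_size: int) -> float:
    """Average number of behavior images attached to one training sample."""
    return images_per_batch / batch_size


def mean_pool(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors (assumed aggregation, not from the paper)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def build_input(ad_id_emb: Vector, ad_img_emb: Vector,
                behavior_img_embs: List[Vector]) -> Vector:
    """Concatenate ID and image representations into one model input (hypothetical layout)."""
    user_interest = mean_pool(behavior_img_embs)  # user interest from behavior images
    return ad_id_emb + ad_img_emb + user_interest  # list concatenation


if __name__ == "__main__":
    # Figures from the abstract: 2.4 million images per mini-batch, batch size 60k.
    print(images_per_sample(2_400_000, 60_000))
    # Toy embeddings, purely illustrative.
    print(build_input([0.1, 0.2], [0.3], [[1.0], [3.0]]))
```

The first call shows that each sample carries about 40 behavior images on average, which is why image handling dominates the computation and storage cost at this batch size.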