Statistical Properties of Exclusive and Non-exclusive Online Randomized Experiments using Bucket Reuse
Randomized experiments are a key part of product development in the tech industry. It is often necessary to run programs of exclusive experiments, i.e., experiments that cannot be run on the same units at the same time. These programs imply a restriction on the random sampling, as units that are currently in an experiment cannot be sampled into a new one. Moreover, to make this type of coordination technically feasible with large populations, the units in the population are often grouped into 'buckets', and sampling is then performed at the bucket level. This paper investigates some statistical implications of both the restricted sampling and the bucket-level sampling. The contribution of this paper is threefold. First, bucket sampling is connected to the existing literature on randomized experiments in complex sampling designs, which makes it possible to establish properties of the difference-in-means estimator of the average treatment effect. These properties are needed for inference to the population under random sampling of buckets. Second, the bias introduced by the restricted sampling that programs of exclusive experiments impose is derived. Finally, simulation results supporting the theoretical findings are presented, together with recommendations on how to empirically evaluate and handle this bias.
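The mechanics described above (grouping units into buckets, sampling only buckets that no running exclusive experiment holds, and estimating the effect with the difference in means) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not taken from the paper: the salted-hash bucketing in `bucket_of`, the bucket count `N_BUCKETS`, the unit-level treatment assignment inside the sampled buckets, and the simulated outcomes. It shows only the sampling mechanics and does not reproduce the paper's bias derivation or simulations.

```python
import hashlib
import random
from statistics import mean

N_BUCKETS = 100  # hypothetical number of buckets


def bucket_of(unit_id: str, salt: str = "demo-salt") -> int:
    """Map a unit to a bucket via a salted hash (an assumed scheme)."""
    digest = hashlib.sha256(f"{salt}:{unit_id}".encode()).hexdigest()
    return int(digest, 16) % N_BUCKETS


def sample_buckets(n_wanted: int, occupied: set, rng: random.Random) -> set:
    """Sample buckets uniformly among those not held by a running
    exclusive experiment (the restricted sampling)."""
    available = [b for b in range(N_BUCKETS) if b not in occupied]
    if n_wanted > len(available):
        raise ValueError("not enough free buckets")
    return set(rng.sample(available, n_wanted))


def difference_in_means(outcomes, treated) -> float:
    """Mean outcome of treated units minus mean outcome of control units."""
    y1 = [y for y, t in zip(outcomes, treated) if t]
    y0 = [y for y, t in zip(outcomes, treated) if not t]
    return mean(y1) - mean(y0)


rng = random.Random(0)
units = [f"user-{i}" for i in range(10_000)]

# Buckets 0..29 are assumed to be held by an earlier exclusive experiment.
occupied = set(range(30))
sampled = sample_buckets(20, occupied, rng)

# Every unit in a sampled bucket enters the new experiment.
in_experiment = [u for u in units if bucket_of(u) in sampled]

# Assumed design: unit-level coin flip for treatment within the sample.
treated = [rng.random() < 0.5 for _ in in_experiment]

# Simulated outcomes with a constant additive effect (illustration only).
true_effect = 1.0
outcomes = [rng.gauss(0.0, 1.0) + (true_effect if t else 0.0) for t in treated]

estimate = difference_in_means(outcomes, treated)
print(f"units in experiment: {len(in_experiment)}, estimate: {estimate:.3f}")
```

In this toy setup the outcomes do not depend on which bucket a unit belongs to, so excluding the occupied buckets does not distort the estimate. A bias of the kind the paper studies would require, for example, that units in the free buckets differ systematically from the rest of the population.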