Generalizing trial findings using nested trial designs with sub-sampling of non-randomized individuals
To generalize inferences from a randomized trial to the target population of all trial-eligible individuals, investigators can use nested trial designs, where the randomized individuals are nested within a cohort of trial-eligible individuals, including those who are not offered or refuse randomization. In these designs, data on baseline covariates are collected from the entire cohort, and treatment and outcome data need only be collected from randomized individuals. In this paper, we describe nested trial designs that improve research economy by collecting additional baseline covariate data after sub-sampling non-randomized individuals (i.e., a two-stage design), using sampling probabilities that may depend on the initial set of baseline covariates available from all individuals in the cohort. We propose an estimator for the potential outcome mean in the target population of all trial-eligible individuals and show that our estimator is doubly robust, in the sense that it is consistent when either the model for the conditional outcome mean among randomized individuals or the model for the probability of trial participation is correctly specified. We assess the impact of sub-sampling on the asymptotic variance of our estimator and examine the estimator's finite-sample performance in a simulation study. We illustrate the methods using data from the Coronary Artery Surgery Study (CASS).
READ FULL TEXT