Propensity Score Methods for Merging Observational and Experimental Datasets
We consider merging information from a randomized controlled trial (RCT) into a much larger observational database (ODB) in order to estimate a treatment effect. In our motivating setting, the ODB has better representativeness (external validity) while the RCT has genuine randomization. We work with strata defined by propensity score in the ODB. For each subject in the RCT, we find the propensity score they would have had, had they been in the ODB. We develop two hybrid methods. The first method simply spikes the RCT data into their corresponding ODB strata. The second method takes a data-driven convex combination of the ODB and RCT treatment effect estimates within each stratum. We develop delta method estimates of the bias and variance of these methods and evaluate them in simulations. The spike-in method works best when the RCT covariates are drawn from the same distribution as in the ODB. When the RCT inclusion criteria are very different from those of the ODB, the spike-in estimate can be severely biased and the second, dynamic method works better.
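To make the two hybrid estimators concrete, the following is a minimal Python sketch under stated assumptions, not the authors' implementation. The function names (`diff_in_means`, `hybrid_estimates`), the use of five quantile-based strata, the weighting of strata by their ODB share, and the inverse-variance rule for the convex weight `lam` are all illustrative assumptions; the abstract does not specify the data-driven weight. The sketch also uses the true propensity function on simulated data, whereas in practice it would be estimated from the ODB, and it assumes every stratum contains at least two treated and two control subjects from each source.

```python
import numpy as np

def diff_in_means(y, w):
    """Treated-minus-control difference in means and its estimated variance."""
    y1, y0 = y[w == 1], y[w == 0]
    est = y1.mean() - y0.mean()
    var = y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size
    return est, var

def hybrid_estimates(e_odb, w_odb, y_odb, e_rct, w_rct, y_rct, n_strata=5):
    """Spike-in and convex-combination estimates (illustrative sketch)."""
    # Strata are cut at quantiles of the ODB propensity scores only.
    edges = np.quantile(e_odb, np.linspace(0, 1, n_strata + 1)[1:-1])
    s_odb = np.searchsorted(edges, e_odb, side="right")
    # RCT subjects are placed using the propensity they would have had in the ODB.
    s_rct = np.searchsorted(edges, e_rct, side="right")

    spike = np.zeros(n_strata)
    dynamic = np.zeros(n_strata)
    lam = np.zeros(n_strata)
    for k in range(n_strata):
        o, r = s_odb == k, s_rct == k
        # Spike-in: pool RCT subjects into the matching ODB stratum.
        y_pool = np.concatenate([y_odb[o], y_rct[r]])
        w_pool = np.concatenate([w_odb[o], w_rct[r]])
        spike[k], _ = diff_in_means(y_pool, w_pool)
        # Convex combination of the separate ODB and RCT estimates.
        tau_o, var_o = diff_in_means(y_odb[o], w_odb[o])
        tau_r, var_r = diff_in_means(y_rct[r], w_rct[r])
        # ASSUMPTION: inverse-variance weight on the RCT estimate,
        # a stand-in for the paper's data-driven weight.
        lam[k] = var_o / (var_o + var_r)
        dynamic[k] = lam[k] * tau_r + (1.0 - lam[k]) * tau_o

    # ASSUMPTION: strata are weighted by their share of the ODB.
    share = np.bincount(s_odb, minlength=n_strata) / s_odb.size
    return {
        "spike_in": float(share @ spike),
        "dynamic": float(share @ dynamic),
        "lambda": lam,
    }

# Simulated example: constant treatment effect of 2, outcome unconfounded.
rng = np.random.default_rng(0)
propensity = lambda x: 1.0 / (1.0 + np.exp(-x))  # true ODB propensity (assumed known here)
x_odb = rng.normal(size=5000)
x_rct = rng.normal(size=500)
w_odb = rng.binomial(1, propensity(x_odb))   # ODB treatment depends on x
w_rct = rng.binomial(1, 0.5, size=500)       # RCT treatment is randomized
y_odb = 2.0 * w_odb + rng.normal(size=5000)
y_rct = 2.0 * w_rct + rng.normal(size=500)

result = hybrid_estimates(propensity(x_odb), w_odb, y_odb,
                          propensity(x_rct), w_rct, y_rct)
print(result["spike_in"], result["dynamic"])
```

In this simulated setting the RCT covariates follow the same distribution as the ODB covariates and the outcome does not depend on the covariate, so both estimates should land near the true effect of 2. Inverse-variance weighting ignores any bias in the ODB estimate, so it should be read only as a placeholder for a weight that accounts for both bias and variance.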