Partial Counterfactual Identification from Observational and Experimental Data
This paper investigates the problem of bounding counterfactual queries from an arbitrary collection of observational and experimental distributions and qualitative knowledge about the underlying data-generating model represented in the form of a causal diagram. We show that all counterfactual distributions in an arbitrary structural causal model (SCM) could be generated by a canonical family of SCMs with the same causal diagram where unobserved (exogenous) variables are discrete with a finite domain. Utilizing the canonical SCMs, we translate the problem of bounding counterfactuals into that of polynomial programming whose solution provides optimal bounds for the counterfactual query. Solving such polynomial programs is in general computationally expensive. We therefore develop effective Monte Carlo algorithms to approximate the optimal bounds from an arbitrary combination of observational and experimental data. Our algorithms are validated extensively on synthetic and real-world datasets.
READ FULL TEXT