Double/Debiased Machine Learning for Logistic Partially Linear Model
We propose double/debiased machine learning approaches for inference, at the parametric rate, on the parametric component of a logistic partially linear model. In this model the binary response follows a conditional logistic model whose linear predictor combines a low-dimensional linear function of some key (exposure) covariates with a nonparametric function that adjusts for the confounding effect of the other covariates. We consider a Neyman orthogonal (doubly robust) score equation involving two nuisance functions: the nonparametric component of the logistic model and the conditional mean of the exposure given the other covariates with the response held fixed. To estimate the nuisance models, we separately consider high-dimensional (HD) sparse parametric models and more general (typically nonparametric) machine learning (ML) methods. In the HD case, we derive moment equations that calibrate the first-order bias of the nuisance models and give our method a model double robustness property: the estimator achieves the desired rate when at least one of the nuisance models is correctly specified and both are ultra-sparse. In the ML case, the non-linearity of the logit link makes it substantially harder than in the partially linear setting to use an arbitrary conditional mean learning algorithm to estimate the nuisance component of the logistic model. We overcome this obstacle with a novel full model refitting procedure that is easy to implement and facilitates the use of nonparametric ML algorithms in our framework. Our ML estimator is rate doubly robust in the same sense as Chernozhukov et al. (2018a). We evaluate our methods through simulation studies and apply them to assess the effect of the emergency contraceptive (EC) pill on early gestation foetal in the context of a 2008 policy reform in Chile (Bentancor and Clarke, 2017).
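To make the double robustness of the score concrete, the following is a minimal sketch and not the estimator proposed here. It assumes one known form of doubly robust score for this model, E[{Y exp(-beta*A - r(Z)) - (1 - Y)}{A - m(Z)}] = 0, with m(Z) = E[A | Y = 0, Z]; the abstract does not spell out the score, so this form is an assumption. The data-generating process, the function names (`simulate`, `score`, `solve_beta`) and all constants are hypothetical. The nuisance functions are plugged in as known truths or deliberately wrong constants rather than estimated, so the sketch omits cross-fitting, the HD bias calibration and the full model refitting step.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate(n, beta0, rng):
    """Hypothetical design satisfying logit P(Y=1 | A, Z) = beta0*A + r(Z).

    A | Y, Z ~ N(m(Z) + beta0*Y, 1), so that E[A | Y=0, Z] = m(Z), and the
    marginal log-odds of Y given Z is r(Z) + beta0*m(Z) + beta0**2 / 2.
    """
    Z = rng.normal(size=n)
    m = 0.5 * Z                # true m(Z) = E[A | Y=0, Z]
    r = np.sin(Z) - 0.5        # true nonparametric component r(Z)
    Y = rng.binomial(1, expit(r + beta0 * m + 0.5 * beta0**2))
    A = m + beta0 * Y + rng.normal(size=n)
    return Y, A, r, m

def score(beta, Y, A, r_hat, m_hat):
    """Empirical version of the assumed doubly robust score."""
    return np.mean((Y * np.exp(-beta * A - r_hat) - (1 - Y)) * (A - m_hat))

def solve_beta(Y, A, r_hat, m_hat, lo=-2.0, hi=2.0, tol=1e-8):
    """Find the root of the empirical score in beta by bisection."""
    f_lo = score(lo, Y, A, r_hat, m_hat)
    f_hi = score(hi, Y, A, r_hat, m_hat)
    if f_lo * f_hi > 0:
        raise ValueError("score does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = score(mid, Y, A, r_hat, m_hat)
        if f_lo * f_mid <= 0:
            hi = mid               # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # root lies in [mid, hi]
    return 0.5 * (lo + hi)

rng = np.random.default_rng(0)
beta0 = 0.5
Y, A, r_true, m_true = simulate(50_000, beta0, rng)
zeros = np.zeros_like(r_true)

beta_both = solve_beta(Y, A, r_true, m_true)   # both nuisances correct
beta_r_wrong = solve_beta(Y, A, zeros, m_true) # r misspecified as 0
beta_m_wrong = solve_beta(Y, A, r_true, zeros) # m misspecified as 0
print(beta_both, beta_r_wrong, beta_m_wrong)
```

Under this design, each of the three estimates should land near the true value 0.5, because the score has mean zero at the true parameter whenever either r or m is correct. Setting both nuisances to zero at once removes that guarantee.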