Mixture model for designs in high dimensional regression and the LASSO

10/17/2012
by Stéphane Chrétien, et al.

The LASSO is a recent technique for variable selection in the regression model y = Xβ + ϵ, where X ∈ ℝ^(n×p) and ϵ is a centered Gaussian i.i.d. noise vector N(0, σ^2 I). The LASSO has been proved to perform exact support recovery for regression vectors when the design matrix satisfies certain algebraic conditions and β is sufficiently sparse. Estimation of the vector Xβ has also been studied extensively for the purpose of prediction, under the same algebraic conditions on X and under sufficient sparsity of β. Among many others, the coherence is an index which can be used to study these properties of the LASSO. More precisely, a small coherence implies that most sparse vectors, with fewer nonzero components than the order n/log(p), can be recovered with high probability if their nonzero components are larger than the order σ√(log(p)). However, many matrices occurring in practice do not have a small coherence, and thus most results which have appeared in the literature cannot be applied. The goal of this paper is to study a model for which precise results can be obtained. In the proposed model, the columns of the design matrix are drawn from a Gaussian mixture model, and the coherence condition is imposed on the much smaller matrix whose columns are the mixture's centers, instead of on X itself. Our main theorem states that Xβ is estimated as well as in the case of small coherence, up to a correction parametrized by the maximal variance in the mixture model.
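The contrast between the coherence of X and the coherence of the matrix of centers can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes the coherence is the largest absolute inner product between distinct unit-normalized columns, that each column of X picks a center uniformly at random, and that every mixture component has the same isotropic standard deviation `s`. The function names `coherence` and `sample_mixture_design`, and all numerical values, are hypothetical choices made for illustration.

```python
import numpy as np

def coherence(M):
    """Largest absolute inner product between distinct unit-normalized columns."""
    M = M / np.linalg.norm(M, axis=0, keepdims=True)
    G = np.abs(M.T @ M)
    np.fill_diagonal(G, 0.0)  # ignore each column's inner product with itself
    return G.max()

def sample_mixture_design(centers, p, s, rng):
    """Draw p columns, each a uniformly chosen center plus N(0, s^2 I) noise.

    Assumption: equal mixture weights and a common isotropic variance s^2.
    """
    n, K = centers.shape
    labels = rng.integers(0, K, size=p)
    X = centers[:, labels] + s * rng.standard_normal((n, p))
    return X, labels

rng = np.random.default_rng(0)
n, K, p = 200, 10, 500  # illustrative sizes with K much smaller than p

# Random unit-norm centers (an assumption; any low-coherence matrix would do).
centers = rng.standard_normal((n, K))
centers /= np.linalg.norm(centers, axis=0, keepdims=True)

X, labels = sample_mixture_design(centers, p, s=0.01, rng=rng)

mu_centers = coherence(centers)  # coherence of the small matrix of centers
mu_X = coherence(X)              # coherence of the full design matrix
print(f"coherence of centers: {mu_centers:.3f}")
print(f"coherence of X:       {mu_X:.3f}")
```

Because many columns of X share the same center, X itself is highly coherent even when the centers are not, which is the situation the abstract describes.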
