A first-order optimization algorithm for statistical learning with hierarchical sparsity structure
In many statistical learning problems, it is desirable that the optimal solution conform to an a priori known sparsity structure, e.g., for better interpretability. Inducing such structures by means of convex regularizers requires nonsmooth penalty functions that exploit group overlap. Our study focuses on evaluating the proximal operator of the Latent Overlapping Group lasso developed by Jacob et al. (2009). We develop an Alternating Direction Method of Multipliers (ADMM) with a sharing scheme to solve large-scale instances of the underlying optimization problem efficiently. In the absence of strong convexity, linear convergence of the algorithm is established using error bound theory. More specifically, the paper contributes to establishing primal and dual error bounds when the feasible set is unbounded and when the nonsmooth component in the objective function does not have a polyhedral epigraph. Numerical simulation studies supporting the proposed algorithm and two learning applications are also discussed.
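To make the problem setting concrete, the sketch below is a minimal NumPy illustration (not the authors' implementation) of a generic sharing-style ADMM applied to the proximal operator of the Latent Overlapping Group lasso, prox(y) = argmin_x 0.5||x - y||^2 + lam * Omega_LOG(x), where Omega_LOG(x) is the minimum of sum_g w_g ||v_g||_2 over latent vectors v_g supported on group g with sum_g v_g = x. The group definitions, weights, penalty parameter rho, iteration count, and the specific splitting used here are illustrative assumptions; the paper's actual updates and convergence analysis may differ.

```python
# Minimal sketch of a sharing-style ADMM for the latent overlapping group lasso prox.
# All names and parameter choices below are illustrative, not taken from the paper.
import numpy as np

def log_lasso_prox(y, groups, weights, lam, rho=1.0, n_iter=200):
    p, G = y.size, len(groups)
    v = np.zeros((G, p))  # latent per-group vectors v_g, embedded in R^p
    s = np.zeros((G, p))  # shared copies s_g = v_g (consensus variables)
    u = np.zeros((G, p))  # scaled dual variables
    for _ in range(n_iter):
        # v-update: group soft-thresholding restricted to each group's support
        for g, idx in enumerate(groups):
            a = (s[g] - u[g])[idx]
            norm_a = np.linalg.norm(a)
            shrink = max(0.0, 1.0 - (lam * weights[g] / rho) / norm_a) if norm_a > 0 else 0.0
            v[g] = 0.0
            v[g, idx] = shrink * a
        # s-update: closed form for min 0.5*||sum_g s_g - y||^2 + (rho/2)*sum_g ||s_g - c_g||^2
        c = v + u
        S = (rho * c.sum(axis=0) + G * y) / (rho + G)  # optimal value of sum_g s_g
        s = c - (S - y) / rho
        # dual update
        u += v - s
    return v.sum(axis=0)  # prox point x = sum_g v_g

# toy usage with two overlapping groups sharing coordinate 2
y = np.array([3.0, -2.0, 0.5, 4.0])
groups = [np.array([0, 1, 2]), np.array([2, 3])]
x = log_lasso_prox(y, groups, weights=[1.0, 1.0], lam=1.0)
```

The per-group subproblems reduce to group soft-thresholding, while the coupling term 0.5||sum_g s_g - y||^2 is handled jointly in the s-update, which is the structural feature that makes a sharing scheme natural for this prox evaluation.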