Multivariate Convex Regression at Scale

05/23/2020

∙

We present new large-scale algorithms for fitting a multivariate convex regression function to n samples in d dimensions—a key problem in shape constrained nonparametric regression with widespread applications in engineering and the applied sciences. The infinite-dimensional learning task can be expressed via a convex quadratic program (QP) with O(nd) decision variables and O(n^2) constraints. While instances with n in the lower thousands can be addressed with current algorithms within reasonable runtimes, solving larger problems (e.g., n≈ 10^4 or 10^5) are computationally challenging. To this end, we present an active set type algorithm on the Lagrangian dual (of a perturbation) of the primal QP. For computational scalability, we perform approximate optimization of the reduced sub-problems; and propose a variety of randomized augmentation rules for expanding the active set. Although the dual is not strongly convex, we present a novel linear convergence rate of our algorithm on the dual. We demonstrate that our framework can solve instances of the convex regression problem with n=10^5 and d=10—a QP with 10 billion variables—within minutes; and offers significant computational gains (e.g., in terms of memory and runtime) compared to current algorithms.

READ FULL TEXT

Multivariate Convex Regression at Scale

Sign in with Google

Consider DeepAI Pro