Multivariate Convex Regression at Scale

05/23/2020
by   Wenyu Chen, et al.
0

We present new large-scale algorithms for fitting a multivariate convex regression function to n samples in d dimensions—a key problem in shape constrained nonparametric regression with widespread applications in engineering and the applied sciences. The infinite-dimensional learning task can be expressed via a convex quadratic program (QP) with O(nd) decision variables and O(n^2) constraints. While instances with n in the lower thousands can be addressed with current algorithms within reasonable runtimes, solving larger problems (e.g., n≈ 10^4 or 10^5) are computationally challenging. To this end, we present an active set type algorithm on the Lagrangian dual (of a perturbation) of the primal QP. For computational scalability, we perform approximate optimization of the reduced sub-problems; and propose a variety of randomized augmentation rules for expanding the active set. Although the dual is not strongly convex, we present a novel linear convergence rate of our algorithm on the dual. We demonstrate that our framework can solve instances of the convex regression problem with n=10^5 and d=10—a QP with 10 billion variables—within minutes; and offers significant computational gains (e.g., in terms of memory and runtime) compared to current algorithms.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset