On Distributed Exact Sparse Linear Regression over Networks
In this work, we propose an algorithm for solving exact sparse linear regression problems over a network in a distributed manner. Particularly, we consider the problem where data is stored among different computers or agents that seek to collaboratively find a common regressor with a specified sparsity k, i.e., the L0-norm is less than or equal to k. Contrary to existing literature that uses L1 regularization to approximate sparseness, we solve the problem with exact sparsity k. The main novelty in our proposal lies in showing a problem formulation with zero duality gap for which we adopt a dual approach to solve the problem in a decentralized way. This sets a foundational approach for the study of distributed optimization with explicit sparsity constraints. We show theoretically and empirically that, under appropriate assumptions, where each agent solves smaller and local integer programming problems, all agents will eventually reach a consensus on the same sparse optimal regressor.
READ FULL TEXT