Coresets for Wasserstein Distributionally Robust Optimization Problems
Wasserstein distributionally robust optimization (WDRO) is a popular model for enhancing the robustness of machine learning with ambiguous data. However, the complexity of WDRO can be prohibitive in practice, since solving its "minimax" formulation requires a large amount of computation. Recently, several fast training algorithms have been developed for specific machine learning tasks (e.g., logistic regression). To the best of our knowledge, however, research on efficient algorithms for general large-scale WDRO problems is still quite limited. A coreset is an important tool for compressing large datasets, and it has been widely applied to reduce the computational complexity of many optimization problems. In this paper, we introduce a unified framework to construct an ϵ-coreset for general WDRO problems. Although it is challenging to obtain a conventional coreset for WDRO because of the uncertainty of ambiguous data, we show that a "dual coreset" can be computed by using the strong duality property of WDRO. Moreover, the error introduced by the dual coreset is theoretically guaranteed with respect to the original objective. To construct the dual coreset, we propose a novel grid sampling approach that is particularly suitable for the dual formulation of WDRO. Finally, we implement our coreset approach and illustrate its effectiveness on several WDRO problems in the experiments.
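For orientation, the strong duality mentioned above is commonly stated in the Wasserstein distributionally robust optimization literature in roughly the following form. The notation here is an assumption for illustration and is not taken from the abstract: a loss $\ell$, model parameter $\theta$, empirical distribution $P_n$ over samples $\xi_1,\dots,\xi_n$, Wasserstein radius $\sigma$, and ground metric $d$ with exponent $p$. The paper's exact statement and regularity conditions may differ.

```latex
% Standard strong-duality form for Wasserstein DRO (notation assumed, see lead-in).
\sup_{P \,:\, W_p(P, P_n) \le \sigma} \mathbb{E}_{\xi \sim P}\bigl[\ell(\theta; \xi)\bigr]
  \;=\;
\inf_{\lambda \ge 0} \Bigl\{ \lambda \sigma^{p}
  + \frac{1}{n} \sum_{i=1}^{n} h_i(\theta, \lambda) \Bigr\},
\qquad
h_i(\theta, \lambda) = \sup_{\xi} \bigl[\ell(\theta; \xi) - \lambda\, d(\xi, \xi_i)^{p}\bigr].

% One natural reading of a "dual coreset": a weighted subset C of the indices,
% with weights w_i >= 0, that preserves the dual sum up to relative error epsilon
% for all (theta, lambda) in the range of interest.
\sum_{i \in C} w_i\, h_i(\theta, \lambda)
  \;\in\; (1 \pm \epsilon) \cdot \frac{1}{n} \sum_{i=1}^{n} h_i(\theta, \lambda).
```

Because the dual objective is a finite sum over the data points, it has the shape that coreset constructions typically target, which is presumably why the compression is carried out on the dual side rather than on the minimax form.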