Over-parameterization Improves Generalization in the XOR Detection Problem

10/06/2018

∙

Empirical evidence suggests that neural networks with ReLU activations generalize better with over-parameterization. However, there is currently no theoretical analysis that explains this observation. In this work, we study a simplified learning task with over-parameterized convolutional networks that empirically exhibits the same qualitative phenomenon. For this setting, we provide a theoretical analysis of the optimization and generalization performance of gradient descent. Specifically, we prove data-dependent sample complexity bounds which show that over-parameterization improves the generalization performance of gradient descent.

READ FULL TEXT

Over-parameterization Improves Generalization in the XOR Detection Problem

Sign in with Google

Consider DeepAI Pro