Most Neural Networks Are Almost Learnable
We present a PTAS for learning random constant-depth networks. We show that for any fixed ϵ>0 and depth i, there is a poly-time algorithm that for any distribution on √d·𝕊^{d-1} learns random Xavier networks of depth i, up to an additive error of ϵ. The algorithm runs in time and sample complexity of (d̅)^poly(ϵ^{-1}), where d̅ is the size of the network. For some cases of sigmoid and ReLU-like activations, the bound can be improved to (d̅)^polylog(ϵ^{-1}), resulting in a quasi-poly-time algorithm for learning constant-depth random networks.
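To make the learning setting concrete, the following is a minimal sketch (of the target model, not of the learning algorithm) showing how a random Xavier network and inputs on √d·𝕊^{d-1} might be sampled. The widths, depth, tanh activation, and the 1/fan-in Xavier convention are illustrative assumptions rather than specifics taken from the paper.

```python
import numpy as np

def sample_sphere(n, d, rng):
    """Draw n inputs uniformly from the sphere of radius sqrt(d) in R^d."""
    x = rng.standard_normal((n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    return np.sqrt(d) * x

def random_xavier_network(widths, rng):
    """Sample weight matrices with Xavier-style scaling: W_ij ~ N(0, 1/fan_in).
    (One common convention; the paper's exact initialization is in the full text.)"""
    return [rng.standard_normal((m, n)) / np.sqrt(n)
            for n, m in zip(widths[:-1], widths[1:])]

def forward(weights, x, act=np.tanh):
    """Evaluate the random target network on inputs x (rows), with act between layers."""
    h = x
    for W in weights[:-1]:
        h = act(h @ W.T)
    return h @ weights[-1].T  # linear output layer

rng = np.random.default_rng(0)
d, widths = 50, [50, 100, 100, 1]   # illustrative constant-depth architecture
W = random_xavier_network(widths, rng)
X = sample_sphere(1000, d, rng)     # inputs supported on sqrt(d) * S^{d-1}
y = forward(W, X)                   # labels produced by the random target network
```

A learner in this setting would only see pairs (X, y) drawn as above and would aim to output a predictor within additive error ϵ of the random target network.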