
Depth with Nonlinearity Creates No Bad Local Minima in ResNets

10/21/2018


by Kenji Kawaguchi, et al.


In this paper, we prove that depth with nonlinearity creates no bad local minima in a type of arbitrarily deep ResNets studied in previous work. Specifically, the value of every local minimum is no worse than the global minimum value of the corresponding shallow linear predictor with arbitrary fixed features, and is guaranteed to improve further via residual representations. As a result, this paper provides an affirmative answer to an open question stated in a paper at the Conference on Neural Information Processing Systems (NIPS) 2018. We note that even though our paper advances the theoretical foundation of deep learning and non-convex optimization, there is still a gap between theory and many practical deep learning applications.
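The comparison in the abstract (a residual model measured against the best shallow linear predictor) can be made concrete with a small numerical sketch. Everything here is an assumption for illustration: the scalar model `x -> w * (x + v * tanh(u * x))`, the toy data, and the function names are hypothetical and are not the architecture or notation of the paper. The sketch only checks the easy direction (the residual model contains the linear predictor when `v = 0`, and a single descent step on `v` can lower the loss from there). It does not verify the paper's statement about all local minima.

```python
import math

def linear_loss(w, data):
    """Mean squared error of the shallow linear predictor x -> w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def resnet_loss(w, v, u, data):
    """Mean squared error of a toy residual predictor
    x -> w * (x + v * tanh(u * x)). Hypothetical model, for illustration only."""
    return sum((w * (x + v * math.tanh(u * x)) - y) ** 2 for x, y in data) / len(data)

# Hypothetical toy data: a nonlinear target sampled on a small symmetric grid.
data = [(k / 4.0, math.sin(k / 4.0)) for k in range(-8, 9) if k != 0]

# Closed-form global minimum of the 1-D linear predictor (least squares).
w_star = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
baseline = linear_loss(w_star, data)

# With v = 0 the residual branch is switched off, so the residual model
# reproduces the linear predictor and its loss matches the baseline.
u = 1.0
embedded = resnet_loss(w_star, 0.0, u, data)

# Gradient of the residual loss with respect to v, evaluated at v = 0.
grad_v = sum(2 * (w_star * x - y) * w_star * math.tanh(u * x) for x, y in data) / len(data)

# Backtracking step on v alone: accept the first step size that lowers the loss.
step, improved = 1.0, embedded
for _ in range(30):
    candidate = resnet_loss(w_star, -step * grad_v, u, data)
    if candidate < improved:
        improved = candidate
        break
    step /= 2

print(f"linear baseline:         {baseline:.6f}")
print(f"residual model, v = 0:   {embedded:.6f}")
print(f"after one step on v:     {improved:.6f}")
```

Because the step is accepted only when the loss decreases, the final value can never exceed the linear baseline in this sketch. That property comes from the backtracking rule, not from the theorem.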

Kenji Kawaguchi
Yoshua Bengio

