Neural Network Architecture Beyond Width and Depth

05/19/2022
by Zuowei Shen, et al.

This paper proposes a new neural network architecture by introducing an additional dimension, called height, beyond width and depth. Neural network architectures with height, width, and depth as hyperparameters are called three-dimensional architectures. It is shown that neural networks with three-dimensional architectures are significantly more expressive than those with two-dimensional architectures (those with only width and depth as hyperparameters), e.g., standard fully connected networks. The new network architecture is constructed recursively via a nested structure, and hence we call a network with the new architecture a nested network (NestNet). A NestNet of height s is built with each hidden neuron activated by a NestNet of height ≤ s-1; when s=1, a NestNet degenerates to a standard network with a two-dimensional architecture. It is proved by construction that height-s ReLU NestNets with 𝒪(n) parameters can approximate Lipschitz continuous functions on [0,1]^d with an error 𝒪(n^{-(s+1)/d}), while the optimal approximation error of standard ReLU networks with 𝒪(n) parameters is 𝒪(n^{-2/d}). Furthermore, this result is extended to generic continuous functions on [0,1]^d, with the approximation error characterized by the modulus of continuity. Finally, a numerical example is provided to explore the advantages of the super approximation power of ReLU NestNets.
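To make the recursive construction concrete, below is a minimal PyTorch sketch of the nested structure described in the abstract. It is an illustration of the idea, not the authors' implementation: the class name NestNet, the layer sizes, and the choice to share one scalar activation sub-network per hidden layer (rather than one per neuron, as in the paper's description) are assumptions made for brevity.

```python
import torch
import torch.nn as nn

class NestNet(nn.Module):
    """Minimal sketch of a nested network (NestNet) of a given height.

    Height 1 is an ordinary fully connected ReLU network. For height
    s > 1, each hidden pre-activation is passed through a small NestNet
    of height s - 1, which plays the role of the activation function.
    """

    def __init__(self, in_dim: int, width: int, depth: int, height: int):
        super().__init__()
        self.height = height
        dims = [in_dim] + [width] * depth
        self.layers = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(depth)
        )
        self.out = nn.Linear(width, 1)
        if height > 1:
            # One scalar-to-scalar activation sub-network per hidden layer.
            # (The paper activates each hidden *neuron* by a NestNet of
            # height <= s-1; sharing one sub-network per layer is a
            # simplification to keep this sketch short.)
            self.acts = nn.ModuleList(
                NestNet(1, width, depth, height - 1) for _ in range(depth)
            )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for i, lin in enumerate(self.layers):
            z = lin(x)
            if self.height == 1:
                x = torch.relu(z)  # base case: plain ReLU activation
            else:
                # Apply the height-(s-1) sub-network elementwise: each
                # scalar pre-activation is treated as a 1-d input.
                x = self.acts[i](z.reshape(-1, 1)).reshape(z.shape)
        return self.out(x)

# Hypothetical usage: a height-2 NestNet on 4-dimensional inputs.
net = NestNet(in_dim=4, width=8, depth=2, height=2)
y = net(torch.randn(16, 4))  # y has shape (16, 1)
```

Since every component is an ordinary differentiable module, a model built this way trains end-to-end with standard autograd; the recursion bottoms out at height 1, matching the abstract's observation that a height-1 NestNet is just a standard two-dimensional network.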
