We present a stochastic optimization method that uses a fourth-order
reg...
The choice of how to retain information about past gradients dramaticall...
We investigate the use of ellipsoidal trust region constraints for
secon...
Normalization techniques such as Batch Normalization have been applied v...
We analyze the variance of stochastic gradients along negative curvature...