Transformers have become the dominant model in deep learning, but the re...
Recent architectural developments have enabled recurrent neural networks...
Online learning holds the promise of enabling efficient long-term credit...
Neural networks trained with stochastic gradient descent (SGD) starting ...
Equilibrium systems are a powerful way to express neural computations. A...
This paper reviews gradient-based techniques to solve bilevel optimizati...
Finding neural network weights that generalize well from small datasets ...
Meta-learning algorithms leverage regularities that are present on a set...