We investigate the optimal model size and number of tokens for training ...
We enhance auto-regressive language models by conditioning on document c...
Sparse neural networks are becoming increasingly important as the field ...
Deep attention models have advanced the modelling of sequential data acr...
We present the Compressive Transformer, an attentive sequence model whic...
Owing to their ability to both effectively integrate information over lo...
We study the problem of learning associative memory – a system which is ...
Some of the most successful applications of deep reinforcement learning ...
There has been a recent trend in training neural networks to replace dat...
Neural networks trained with backpropagation often struggle to identify ...
Deep neural networks have excelled on a wide range of problems, from vis...
Neural networks augmented with external memory have the ability to learn...