We focus on the problem of learning without forgetting from multiple tas...
Stochastic gradient descent plays a fundamental role in nearly all
appli...
Transformers have become the state-of-the-art neural network architectur...
Decentralized learning with private data is a central problem in machine...
In this paper we propose augmenting Vision Transformer models with learn...
In this work we propose a HyperTransformer, a transformer-based model fo...
Conditional computation and modular networks have been recently proposed...
In this work, we present BasisNet which combines recent advancements in
...
In this paper, we introduce a new type of generalized neural network whe...
Knowledge distillation is one of the most popular and effective techniqu...
In this paper, we propose a new approach for building cellular automata ...
We explore the question of how the resolution of the input image ("input...
We propose a new method for learning image attention masks in a
semi-sup...
We introduce a novel method that enables parameter-efficient transfer an...
In this paper we describe a new mobile architecture, MobileNetV2, that
i...
In this paper we describe a new mobile architecture, MobileNetV2, that
i...
CycleGAN is one of the latest successful approaches to learn a correspon...
Deep convolutional networks are well-known for their high computational ...
Deep neural networks have dramatically advanced the state of the art for...