William Fedus

research

∙ 05/24/2023

Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts

The explosive growth of language models and their applications has led ...

0 Sheng Shen, et al. ∙

research

∙ 09/04/2022

A Review of Sparse Expert Models in Deep Learning

Sparse expert models are a thirty-year-old concept re-emerging as a popu...

42 William Fedus, et al. ∙

research

∙ 07/21/2022

Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?

There has been a lot of interest in the scaling properties of Transform...

0 Yi Tay, et al. ∙

research

∙ 02/17/2022

Designing Effective Sparse Expert Models

Scale has opened new frontiers in natural language processing – but at a...

0 Barret Zoph, et al. ∙

research

∙ 09/22/2021

On Bonus-Based Exploration Methods in the Arcade Learning Environment

Research on exploration in reinforcement learning, as applied to Atari 2...

0 Adrien Ali Taïga, et al. ∙

research

∙ 09/22/2021

Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers

There remain many open questions pertaining to the scaling behaviour of ...

3 Yi Tay, et al. ∙

research

∙ 03/13/2021

Revisiting ResNets: Improved Training and Scaling Strategies

Novel computer vision architectures monopolize the spotlight, but the im...

0 Irwan Bello, et al. ∙

research

∙ 02/23/2021

Do Transformer Modifications Transfer Across Implementations and Applications?

The research community has proposed copious modifications to the Transfo...

10 Sharan Narang, et al. ∙

research

∙ 01/11/2021

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity

In deep learning, models typically reuse the same parameters for all inp...

0 William Fedus, et al. ∙

research

∙ 07/13/2020

Revisiting Fundamentals of Experience Replay

Experience replay is central to off-policy algorithms in deep reinforcem...

5 William Fedus, et al. ∙

research

∙ 02/28/2020

On Catastrophic Interference in Atari 2600 Games

Model-free deep reinforcement learning algorithms are troubled with poor...

8 William Fedus, et al. ∙

research

∙ 11/28/2019

Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction

Text-based games are a natural challenge domain for deep reinforcement l...

3 Vishal Jain, et al. ∙

research

∙ 08/06/2019

Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment

This paper provides an empirical evaluation of recently developed explor...

2 Adrien Ali Taïga, et al. ∙

research

∙ 02/19/2019

Hyperbolic Discounting and Learning over Multiple Horizons

Reinforcement learning (RL) typically defines a discount factor as part ...

4 William Fedus, et al. ∙

research

∙ 11/06/2018

Language GANs Falling Short

Generating high-quality text with sufficient diversity is essential for ...

1 Massimo Caccia, et al. ∙

research

∙ 09/27/2018

Deep Graph Infomax

We present Deep Graph Infomax (DGI), a general approach for learning nod...

4 Petar Veličković, et al. ∙

research

∙ 04/02/2018

Recall Traces: Backtracking Models for Efficient Reinforcement Learning

In many environments only a tiny subset of all states yield high reward....

0 Anirudh Goyal, et al. ∙

research

∙ 02/26/2018

Disentangling the independently controllable factors of variation by interacting with the world

It has been postulated that a good representation is one that disentangl...

0 Valentin Thomas, et al. ∙

research

∙ 01/23/2018

MaskGAN: Better Text Generation via Filling in the ______

Neural text generation models are often autoregressive language models o...

0 William Fedus, et al. ∙

research

∙ 10/23/2017

Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence At Every Step

Generative adversarial networks (GANs) are a family of generative models...

0 William Fedus, et al. ∙
