ChatGPT-like models have revolutionized various applications in artificial intelligence...
Zero Redundancy Optimizer (ZeRO) has been used to train a wide range of ...
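The ZeRO entry above is truncated, so a minimal sketch of how ZeRO is commonly enabled through DeepSpeed's config dict may help; the tiny stand-in model, batch size, and learning rate below are placeholder assumptions, not values from any of these papers.

```python
# Hedged sketch: enabling ZeRO through DeepSpeed's config dict.
# Assumes the `deepspeed` package, a CUDA GPU, and a run started with
# the deepspeed launcher; the model is a trivial stand-in.
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # placeholder for a large transformer

ds_config = {
    "train_batch_size": 8,  # placeholder value
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    # Stage 3 partitions optimizer states, gradients, and parameters
    # across data-parallel ranks instead of replicating them.
    "zero_optimization": {"stage": 3},
}

# Returns the wrapped engine plus the (ZeRO-partitioned) optimizer.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```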
Mixture-of-Experts (MoE) is a neural network architecture that adds spar...
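To make the sparsity the MoE entry refers to concrete, here is a minimal sketch of a top-1-gated MoE layer in plain PyTorch; the class name `SparseMoE`, the dimensions, and the routing scheme are illustrative assumptions, not any paper's actual implementation.

```python
# Minimal sketch of a sparsely gated (top-1) Mixture-of-Experts layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):  # hypothetical name, for illustration only
    """Routes each token to one expert, so only a fraction of weights run."""

    def __init__(self, d_model: int, d_ff: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts)  # router scoring experts
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); flatten batch/sequence dims before calling.
        scores = F.softmax(self.gate(x), dim=-1)  # (tokens, num_experts)
        weight, expert_idx = scores.max(dim=-1)   # top-1 routing decision
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i                # tokens routed to expert i
            if mask.any():
                out[mask] = weight[mask, None] * expert(x[mask])
        return out

moe = SparseMoE(d_model=512, d_ff=2048, num_experts=4)
print(moe(torch.randn(16, 512)).shape)  # torch.Size([16, 512])
```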
The past several years have witnessed the success of transformer-based models...
In the last three years, the largest dense deep learning models have grown...
Large-scale model training has been a playing ground for a limited few requiring...
The effectiveness of LSTM neural networks for popular tasks such as Auto...
Training large DL models with billions and potentially trillions of parameters...