Reducing the computational cost of running large scale neural networks u...
We present unit scaling, a paradigm for designing deep learning models t...
We present the award-winning submission to the WikiKG90Mv2 track of
OGB-...
Identifying algorithms for computational efficient unsupervised training...
Attention based language models have become a critical component in
stat...
Differential privacy has gained popularity in machine learning as a stro...