Hyperparameter (HP) tuning in deep learning is an expensive process, pro...
We study the bit complexity of two related fundamental computational pro...
We introduce Codex, a GPT language model fine-tuned on publicly availabl...
We identify empirical scaling laws for the cross-entropy loss in four do...
Recent work has demonstrated substantial gains on many NLP tasks and ben...
Random projections (RP) are a popular tool for reducing dimensionality w...
This is the second in a series of papers on rank decompositions of the m...
The Generalized Lax Conjecture asks whether every hyperbolicity cone is ...