Optimization is ubiquitous. While derivative-based algorithms have been...
Sycophancy is an undesirable behavior where models tailor their response...
Transformers are central to recent successes in natural language process...
The mixture proportions of pretraining data domains (e.g., Wikipedia, bo...
We present symbol tuning - finetuning language models on in-context inpu...
We study how in-context learning (ICL) in language models is affected by...
We present a method to formulate algorithm discovery as program search, ...
The increasing complexity and scale of machine learning (ML) has led to ...
The best neural architecture for a given machine learning problem depend...
Neural networks are sensitive to hyper-parameter and architecture choice...
We present Meena, a multi-turn open-domain chatbot trained end-to-end on...
Building effective neural networks requires many design choices. These i...