The prevalence and high capacity of large language models (LLMs) present...
We present symbol tuning - finetuning language models on in-context inpu...
We present a method to formulate algorithm discovery as program search, ...
Recently, Sharpness-Aware Minimization (SAM), which connects the geometr...
Several recent studies have demonstrated that attention-based networks, ...
Predictor-based algorithms have achieved remarkable performance in the N...
Differentiable Neural Architecture Search is one of the most popular Neu...
Vision Transformers (ViTs) and MLPs signal further efforts on replacing ...
Large-batch training has become a commonly used technique when training ...
Data augmentation has become a de facto component for training high-perf...
This paper proposes a novel differentiable architecture search method by...
Differentiable architecture search (DARTS) is a prevailing NAS solution ...
Despite significant progress in dissecting the genetic architecture of c...
Interaction function (IFC), which captures interactions among items and ...
Most existing recommender systems leverage user behavior data of one typ...
Most existing recommender systems leverage the data of one type of user ...