Hanxiao Liu

research

∙ 09/07/2023

Large Language Models as Optimizers

Optimization is ubiquitous. While derivative-based algorithms have been ...

0 Chengrun Yang, et al. ∙

research

∙ 05/17/2023

DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

The mixture proportions of pretraining data domains (e.g., Wikipedia, bo...

0 Sang Michael Xie, et al. ∙

research

∙ 05/17/2023

IDO-VFI: Identifying Dynamics via Optical Flow Guidance for Video Frame Interpolation with Events

Video frame interpolation aims to generate high-quality intermediate fra...

0 Chenyang Shi, et al. ∙

research

∙ 03/07/2023

Larger language models do in-context learning differently

We study how in-context learning (ICL) in language models is affected by...

0 Jerry Wei, et al. ∙

research

∙ 04/15/2022

Resource-Constrained Neural Architecture Search on Tabular Datasets

The best neural architecture for a given machine learning problem depend...

0 Chengrun Yang, et al. ∙

research

∙ 02/21/2022

Transformer Quality in Linear Time

We revisit the design choices in Transformers, and propose methods to ad...

0 Weizhe Hua, et al. ∙

research

∙ 02/18/2022

Mixture-of-Experts with Expert Choice Routing

Sparsely-activated Mixture-of-experts (MoE) models allow the number of p...

0 Yanqi Zhou, et al. ∙

research

∙ 11/19/2021

Combined Scaling for Zero-shot Transfer Learning

We present a combined scaling method called BASIC that achieves 85.7 zer...

0 Hieu Pham, et al. ∙

research

∙ 09/17/2021

Primer: Searching for Efficient Transformers for Language Modeling

Large Transformer models have been central to recent advances in natural...

0 David R. So, et al. ∙

research

∙ 06/09/2021

CoAtNet: Marrying Convolution and Attention for All Data Sizes

Transformers have attracted increasing interests in computer vision, but...

0 Zihang Dai, et al. ∙

research

∙ 05/17/2021

Pay Attention to MLPs

Transformers have become one of the most important architectural innovat...

39 Hanxiao Liu, et al. ∙

research

∙ 01/21/2021

PyGlove: Symbolic Programming for Automated Machine Learning

Neural networks are sensitive to hyper-parameter and architecture choice...

30 Daiyi Peng, et al. ∙

research

∙ 10/21/2020

Transferable Graph Optimizers for ML Compilers

Most compilers for machine learning (ML) frameworks need to solve many c...

0 Yanqi Zhou, et al. ∙

research

∙ 08/18/2020

Discovering Multi-Hardware Mobile Models via Architecture Search

Developing efficient models for mobile phones or other on-device deploym...

0 Grace Chu, et al. ∙

research

∙ 08/13/2020

Can weight sharing outperform random architecture search? An investigation with TuNAS

Efficient Neural Architecture Search methods based on weight sharing hav...

15 Gabriel Bender, et al. ∙

research

∙ 06/11/2020

Rethinking Pre-training and Self-training

Pre-training is a dominant paradigm in computer vision. For example, sup...

0 Barret Zoph, et al. ∙

research

∙ 04/30/2020

MobileDets: Searching for Object Detection Architectures for Mobile Accelerators

Inverted bottleneck layers, which are built upon depthwise convolutions,...

15 Yunyang Xiong, et al. ∙

research

∙ 04/06/2020

Evolving Normalization-Activation Layers

Normalization layers and activation functions are critical components in...

9 Hanxiao Liu, et al. ∙

research

∙ 03/24/2020

BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models

Neural architecture search (NAS) has shown promising results discovering...

6 Jiahui Yu, et al. ∙

research

∙ 12/02/2019

MnasFPN: Learning Latency-aware Pyramid Architecture for Object Detection on Mobile Devices

Despite the blooming success of architecture search for vision tasks in ...

0 Bo Chen, et al. ∙

research

∙ 12/02/2019

Neural Predictor for Neural Architecture Search

Neural Architecture Search methods are effective but often use complex a...

0 Wei Wen, et al. ∙

research

∙ 09/28/2019

GDP: Generalized Device Placement for Dataflow Graphs

Runtime and scalability of large neural networks can be significantly af...

0 Yanqi Zhou, et al. ∙

research

∙ 06/24/2018

DARTS: Differentiable Architecture Search

This paper addresses the scalability challenge of architecture search by...

0 Hanxiao Liu, et al. ∙

research

∙ 11/01/2017

Hierarchical Representations for Efficient Architecture Search

We explore efficient neural architecture search methods and present a si...

0 Hanxiao Liu, et al. ∙

research

∙ 10/31/2017

Learning Graph Convolution Filters from Data Manifold

Convolution Neural Network (CNN) has gained tremendous success in comput...

0 Guokun Lai, et al. ∙

research

∙ 05/06/2017

Analogical Inference for Multi-Relational Embeddings

Large-scale multi-relational embedding refers to the task of learning th...

0 Hanxiao Liu, et al. ∙

research

∙ 04/15/2017

RACE: Large-scale ReAding Comprehension Dataset From Examinations

We present RACE, a new dataset for benchmark evaluation of methods in th...

0 Guokun Lai, et al. ∙

research

∙ 03/02/2017

A Comparative Study of Word Embeddings for Reading Comprehension

The focus of past machine learning research for Reading Comprehension ta...

0 Bhuwan Dhingra, et al. ∙

research

∙ 06/05/2016

Gated-Attention Readers for Text Comprehension

In this paper we study the problem of answering cloze-style questions ov...

0 Bhuwan Dhingra, et al. ∙
