We present a model that can perform multiple vision tasks and can be ada...
Large Language Models (LLMs) have achieved remarkable results, but exist...
Recent AI-assistant agents, such as ChatGPT, predominantly rely on super...
Decision Transformers (DT) have demonstrated strong performance in offl...
Humans possess a versatile mechanism for extracting structured represent...
Existing large language model-based code generation pipelines typically ...
Large Transformer-based Pretrained Language Models (PLMs) dominate almos...
Optimization in multi-task learning (MTL) is more challenging than singl...
Mixture-of-Experts (MoE) networks have been proposed as an efficient way...
Humans can leverage prior experience and learn novel tasks from a handfu...
In this thesis, we try to build a connection between the two schools by ...
Many complex real-world tasks are composed of several levels of sub-task...
There are two major classes of natural language grammars – the dependenc...
Transformers do not scale very well to long sequence lengths largely bec...
We model the recursive production property of context-free grammars for ...
It is commonly believed that knowledge of syntactic structure should imp...
Stack-augmented recurrent neural networks (RNNs) have been of interest t...
The ability to understand logical relationships between sentences is an ...
Recurrent neural network (RNN) models are widely used for processing seq...
In this work, we propose a novel method for training neural networks to ...
In this work, we propose a novel constituency parsing scheme. The model ...
Learning distributed sentence representations remains an interesting pro...
We propose a neural language model capable of unsupervised syntactic str...
We propose a new self-organizing hierarchical softmax formulation for ne...
With the development of community-based question answering (Q&A) service...