Large language models (LLMs) have made significant strides in various ta...
As the capabilities of large language models (LLMs) continue to advance,...
Large language models are powerful text processors and reasoners, but ar...
Open-domain question answering is a crucial task that often requires acc...
Large Language Models (LLMs) act as a powerful Reader in the Retrieve-then...
Diffusion models have gained significant attention in the realm of image...
Based on the remarkable achievements of pre-trained language models in a...
Large language models (LLMs) can achieve highly effective performance on...
Many natural language processing (NLP) tasks rely on labeled data to tra...
Large language models can perform various reasoning tasks by using chain...
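To make the chain-of-thought idea in the abstract above concrete, here is a minimal Python sketch of the prompting pattern, with an invented exemplar and question; calling an actual model is left out.

# Chain-of-thought prompting sketch: a worked exemplar with explicit
# intermediate steps is prepended so the model is nudged to produce its own
# step-by-step rationale before the final answer. Exemplar text is made up.
COT_EXEMPLAR = (
    "Q: A shop sells pens in packs of 12. How many pens are in 4 packs?\n"
    "A: Each pack holds 12 pens. 4 packs hold 4 * 12 = 48 pens. The answer is 48.\n\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend the reasoning exemplar and ask for step-by-step reasoning."""
    return COT_EXEMPLAR + f"Q: {question}\nA: Let's think step by step."

print(build_cot_prompt("A train travels 60 km per hour for 2.5 hours. How far does it go?"))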
In this paper, we propose a large-scale language pre-training for text G...
The dual-encoder has become the de facto architecture for dense retrieva...
Dense retrieval aims to map queries and passages into low-dimensional ve...
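As an illustration of the dual-encoder retrieval setup described in the two abstracts above, the sketch below ranks passages against a query by inner product over low-dimensional vectors; the hash-based encode function is a toy stand-in for a trained neural encoder, not any model from these papers.

import hashlib
import numpy as np

def encode(text: str, dim: int = 64) -> np.ndarray:
    """Toy encoder: hash each token to a pseudo-random vector and average."""
    vecs = []
    for tok in text.lower().split():
        seed = int.from_bytes(hashlib.md5(tok.encode()).digest()[:4], "little")
        vecs.append(np.random.default_rng(seed).standard_normal(dim))
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

passages = [
    "Dense retrieval maps queries and passages into low-dimensional vectors.",
    "Diffusion models are widely used for image generation.",
]
passage_matrix = np.stack([encode(p) for p in passages])   # (n_passages, dim)

query_vec = encode("how does dense retrieval represent passages")
scores = passage_matrix @ query_vec                        # inner-product scores
print(passages[int(np.argmax(scores))])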
Long-form numerical reasoning in financial analysis aims to generate a r...
Knowledge distillation is often used to transfer knowledge from a strong...
We introduce GENIUS: a conditional text generation model using sketches ...
Sampling proper negatives from a large document pool is vital to effecti...
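A minimal sketch of the negative-sampling step mentioned above, assuming only a labeled positive set per query: negatives are drawn uniformly from the rest of the pool (real systems often refine this with in-batch or hard, model-mined negatives).

import random

def sample_negatives(pool, positives, k=4, seed=0):
    """Draw k passages from the pool that are not labeled positives."""
    rng = random.Random(seed)
    candidates = [p for p in pool if p not in positives]
    return rng.sample(candidates, k=min(k, len(candidates)))

pool = [f"doc_{i}" for i in range(100)]
print(sample_negatives(pool, positives={"doc_3", "doc_7"}, k=4))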
Commonsense generation aims to generate a realistic sentence describing ...
Most existing pre-trained language representation models (PLMs) are sub-...
Code contrastive pre-training has recently achieved significant progress...
Knowledge distillation is an effective way to transfer knowledge from a ...
Due to exposure bias, most existing natural language generation (NLG) mo...
Non-autoregressive generation is a sequence generation paradigm, which r...
Dialog response generation in the open domain is an important research topic...
Vector quantization (VQ) based ANN indexes, such as Inverted File System...
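For readers unfamiliar with IVF-style indexes, the sketch below uses the faiss library (assumed installed; data and parameters are illustrative): vectors are assigned to coarse clusters by a quantizer, and each query scans only the nprobe nearest clusters.

import numpy as np
import faiss

d, n_base, n_list = 64, 10_000, 100
rng = np.random.default_rng(0)
xb = rng.standard_normal((n_base, d)).astype("float32")

quantizer = faiss.IndexFlatL2(d)               # coarse quantizer over centroids
index = faiss.IndexIVFFlat(quantizer, d, n_list)
index.train(xb)                                # learn the coarse centroids
index.add(xb)
index.nprobe = 8                               # clusters visited per query

xq = rng.standard_normal((3, d)).astype("float32")
distances, ids = index.search(xq, 5)           # top-5 neighbors per query
print(ids)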
In this paper, we propose the CodeRetriever model, which combines the un...
Current dense text retrieval models face two typical challenges. First, ...
We study the problem of coarse-grained response selection in retrieval-b...
Pre-trained language models have led to substantial gains over a broad r...
Transformer-based models have made tremendous impacts in natural languag...
The Transformer model with multi-head attention requires caching intermediat...
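The caching referred to above can be illustrated with a toy single-head decoder step in numpy: keys and values of already-generated positions are kept so each new step only projects the latest token. Weights and shapes are placeholders, not those of the models in these papers.

import numpy as np

d_model = 16
rng = np.random.default_rng(0)
W_q, W_k, W_v = (rng.standard_normal((d_model, d_model)) for _ in range(3))
k_cache, v_cache = [], []

def decode_step(x_t):
    """x_t: (d_model,) hidden state of the newest token."""
    k_cache.append(x_t @ W_k)                  # cache key of the new position
    v_cache.append(x_t @ W_v)                  # cache value of the new position
    K, V = np.stack(k_cache), np.stack(v_cache)
    q = x_t @ W_q
    scores = K @ q / np.sqrt(d_model)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V                         # attention output for this step

for _ in range(4):                             # four decoding steps
    out = decode_step(rng.standard_normal(d_model))
print(out.shape, len(k_cache))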
In this paper, we introduce a two-level attention schema, Poolingformer,...
The pre-training technique is now ubiquitous in natural language proces...
The Transformer is an attention-based neural network, which consists of two ...
Conditional random fields (CRF) for label decoding have become ubiquitous...
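As background for the CRF decoding mentioned above, here is an illustrative Viterbi routine in numpy that recovers the highest-scoring tag sequence from emission and transition scores; the random scores are placeholders.

import numpy as np

def viterbi(emissions, transitions):
    """emissions: (T, n_tags) scores; transitions: (n_tags, n_tags) scores."""
    T, n_tags = emissions.shape
    score = emissions[0].copy()                # best score ending in each tag
    backptr = np.zeros((T, n_tags), dtype=int)
    for t in range(1, T):
        total = score[:, None] + transitions + emissions[t]   # (prev, cur)
        backptr[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]

rng = np.random.default_rng(0)
print(viterbi(rng.standard_normal((6, 4)), rng.standard_normal((4, 4))))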
Commonsense generation aims at generating plausible everyday scenario de...
In a sponsored search engine, generative retrieval models have recently been p...
In this paper, we propose a novel data augmentation method, referred to ...
Reading long documents to answer open-domain questions remains challengi...
News headline generation aims to produce a short sentence to attract rea...
In this paper, we introduce XGLUE, a new benchmark dataset to train larg...
In this paper, we present a new sequence-to-sequence pre-training model ...
Neural semantic parsing has achieved impressive results in recent years,...
In this paper, we propose a novel pretraining-based encoder-decoder fram...