Large language models achieve state-of-the-art performance on sequence
g...
Applying Reinforcement Learning (RL) to sequence generation models enabl...
While state-of-the-art NLP models have demonstrated excellent performanc...
Parameter sharing has proven to be an effective route to parameter efficiency. Previ...
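A common instantiation of parameter sharing is reusing one Transformer layer's weights across the whole depth (as in the Universal Transformer and ALBERT). A minimal PyTorch sketch, with all names and sizes illustrative rather than taken from the paper:

    import torch
    import torch.nn as nn

    class SharedLayerEncoder(nn.Module):
        # A single set of layer parameters, applied num_layers times.
        def __init__(self, d_model=512, nhead=8, num_layers=6):
            super().__init__()
            self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.num_layers = num_layers

        def forward(self, x):
            for _ in range(self.num_layers):
                x = self.layer(x)  # same weights at every depth
            return x

    y = SharedLayerEncoder()(torch.randn(2, 10, 512))  # one layer's parameters, depth six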
Deploying NMT models on mobile devices is essential for privacy, low lat...
Using translation memories (TMs) as prompts is a promising approach to
i...
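Concretely, using TMs as prompts usually means retrieving similar source-target pairs from the memory and prepending them to the input as in-context examples. A hypothetical sketch; the prompt format and function name are assumptions, not the paper's:

    def build_tm_prompt(source, tm_pairs):
        # tm_pairs: (similar_source, reference_translation) pairs retrieved from the TM.
        lines = [f"Source: {s}\nTranslation: {t}" for s, t in tm_pairs]
        lines.append(f"Source: {source}\nTranslation:")
        return "\n\n".join(lines)

    prompt = build_tm_prompt(
        "The cat sat on the mat.",
        [("The cat sat on the chair.", "Le chat s'est assis sur la chaise.")],
    )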
Learning multiscale Transformer models has been shown to be a viable
ap...
For years, model performance in machine learning has obeyed a power-law
r...
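The power-law relationship referred to here is conventionally written as

    L(x) = a \, x^{-\alpha} + c,

where L is the test loss, x is the scaled resource (parameters, data, or compute), and a, \alpha, c are fitted constants; the notation is illustrative, not the paper's.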
Pretrained language models (PLMs) have shown remarkable improvements acro...
In the Natural Language for Optimization (NL4Opt) NeurIPS 2022 competiti...
Knowledge distillation addresses the problem of transferring knowledge f...
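For reference, the standard soft-target objective for knowledge distillation (Hinton et al., 2015) blends a temperature-softened teacher distribution with ordinary cross-entropy. A minimal PyTorch sketch; T, alpha, and the function name are illustrative defaults:

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        # Soft term: match the teacher's temperature-softened distribution.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)  # rescale gradients after temperature softening
        # Hard term: cross-entropy against the gold labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard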
Improving machine translation (MT) systems with translation memories (TM...
In this paper, we propose a novel architecture, the Enhanced Interactive...
Multiscale feature hierarchies have proven successful in the
co...
Previous work on multimodal machine translation (MMT) has focused on the...
This paper describes the NiuTrans neural machine translation systems of the ...
This paper describes the submissions of the NiuTrans Team to the WNGT 20...
This paper describes the NiuTrans system for the WMT21 translation effic...
This paper addresses the efficiency challenge of Neural Architecture Sea...
Improving Transformer efficiency has become increasingly attractive rece...
This paper describes the submission of the NiuTrans end-to-end speech
tr...
Encoder pre-training is promising in end-to-end Speech Translation (ST),...
It has been found that residual networks are a Euler discretization of
...
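The correspondence is the standard one: a residual block performs one explicit Euler step,

    y_{l+1} = y_l + F(y_l) \quad \leftrightarrow \quad y(t+\Delta t) = y(t) + \Delta t \, f(y(t)), \qquad \Delta t = 1,

so a stack of residual layers integrates the ODE dy/dt = f(y) with unit step size.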
The non-autoregressive Transformer is a promising text generation model. How...
The large attention-based encoder-decoder network (Transformer) has beco...
Recently, deep models have shown tremendous improvements in neural machi...
Unsupervised Bilingual Dictionary Induction methods based on the
initial...
Large amounts of data have made neural machine translation (NMT) a big su...
Traditional neural machine translation is limited to the topmost encoder...
The standard neural machine translation model can only decode with the s...
Deep encoders have proven effective in improving neural machi...
Knowledge distillation has proven effective in model accelera...
8-bit integer inference, as a promising direction for reducing both the
l...
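A minimal sketch of the symmetric post-training scheme commonly behind 8-bit integer inference; the per-tensor scaling here is an illustrative assumption, not necessarily the paper's method:

    import numpy as np

    def quantize_int8(w):
        # Map [-max|w|, max|w|] linearly onto the int8 range [-127, 127].
        scale = max(np.abs(w).max() / 127.0, 1e-12)  # guard against all-zero tensors
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)
    q, s = quantize_int8(w)
    w_hat = dequantize(q, s)  # per-weight error is at most scale / 2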
We present a novel 3D pose refinement approach based on differentiable
r...
Most deep learning frameworks require users to pool their local data or ...
Science of science (SciSci) is an emerging discipline wherein science is...
In encoder-decoder neural models, multiple encoders are generally used ...
Neural architecture search (NAS) has advanced significantly in recent ye...
Accurately predicting drug-target binding affinity (DTA) in silico is a ...
Neural machine translation systems require a number of stacked layers fo...
Though early successes of Statistical Machine Translation (SMT) systems ...
High frequency noise has generally been difficult to cancel actively...
High frequency noise has been difficult to cancel actively at a person's...
Recently, the Transformer machine translation system has shown strong re...
Word embedding is central to neural machine translation (NMT), which has...
The Transformer is the state-of-the-art model in recent machine translation
...
The rapid evolution of scientific research has been creating a huge volu...
Person re-identification aims to robustly measure similarities between p...
Person re-identification aims at finding a person of interest in an imag...
Vehicle re-identification is an important problem and has many applicati...