Large language models achieve state-of-the-art performance on sequence
g...
Applying Reinforcement Learning (RL) to sequence generation models enabl...
While state-of-the-art NLP models have demonstrated excellent performanc...
Parameter sharing has proven to be an effective route to parameter efficiency. Previ...
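A common instantiation of parameter sharing is reusing one Transformer layer's weights across the whole depth (as in the Universal Transformer and ALBERT). A minimal PyTorch sketch, with all names and sizes illustrative rather than taken from the paper:

    import torch
    import torch.nn as nn

    class SharedLayerEncoder(nn.Module):
        # A single set of layer parameters, applied num_layers times.
        def __init__(self, d_model=512, nhead=8, num_layers=6):
            super().__init__()
            self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.num_layers = num_layers

        def forward(self, x):
            for _ in range(self.num_layers):
                x = self.layer(x)  # same weights at every depth
            return x

    y = SharedLayerEncoder()(torch.randn(2, 10, 512))  # one layer's parameters, depth six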
Deploying NMT models on mobile devices is essential for privacy, low lat...
Using translation memories (TMs) as prompts is a promising approach to
i...
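Concretely, using TMs as prompts usually means retrieving similar source-target pairs from the memory and prepending them to the input as in-context examples. A hypothetical sketch; the prompt format and function name are assumptions, not the paper's:

    def build_tm_prompt(source, tm_pairs):
        # tm_pairs: (similar_source, reference_translation) pairs retrieved from the TM.
        lines = [f"Source: {s}\nTranslation: {t}" for s, t in tm_pairs]
        lines.append(f"Source: {source}\nTranslation:")
        return "\n\n".join(lines)

    prompt = build_tm_prompt(
        "The cat sat on the mat.",
        [("The cat sat on the chair.", "Le chat s'est assis sur la chaise.")],
    )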
Learning multiscale Transformer models has been shown to be a viable
ap...
For years, model performance in machine learning has obeyed a power-law
r...
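The power-law relationship referred to here is conventionally written as

    L(x) = a \, x^{-\alpha} + c,

where L is the test loss, x is the scaled resource (parameters, data, or compute), and a, \alpha, c are fitted constants; the notation is illustrative, not the paper's.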
Pretrained language models (PLMs) have shown remarkable improvements acro...
In the Natural Language for Optimization (NL4Opt) NeurIPS 2022 competiti...
Knowledge distillation addresses the problem of transferring knowledge f...
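For reference, the standard soft-target objective for knowledge distillation (Hinton et al., 2015) blends a temperature-softened teacher distribution with ordinary cross-entropy. A minimal PyTorch sketch; T, alpha, and the function name are illustrative defaults:

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        # Soft term: match the teacher's temperature-softened distribution.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)  # rescale gradients after temperature softening
        # Hard term: cross-entropy against the gold labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard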
Improving machine translation (MT) systems with translation memories (TM...
In this paper, we propose a novel architecture, the Enhanced Interactive...
Multiscale feature hierarchies have proven successful in the
co...
Previous work on multimodal machine translation (MMT) has focused on the...
This paper describes the NiuTrans neural machine translation systems of the ...
This paper describes the submissions of the NiuTrans Team to the WNGT 20...
This paper describes the NiuTrans system for the WMT21 translation effic...
This paper addresses the efficiency challenge of Neural Architecture Sea...
Improving Transformer efficiency has become increasingly attractive rece...
This paper describes the submission of the NiuTrans end-to-end speech
tr...
Encoder pre-training is promising in end-to-end Speech Translation (ST),...
It has been found that residual networks are a Euler discretization of
...
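The correspondence is the standard one: a residual block performs one explicit Euler step,

    y_{l+1} = y_l + F(y_l) \quad \leftrightarrow \quad y(t+\Delta t) = y(t) + \Delta t \, f(y(t)), \qquad \Delta t = 1,

so a stack of residual layers integrates the ODE dy/dt = f(y) with unit step size.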
The non-autoregressive Transformer is a promising text generation model. How...
The large attention-based encoder-decoder network (Transformer) has beco...
Recently, deep models have shown tremendous improvements in neural machi...
Unsupervised Bilingual Dictionary Induction methods based on the
initial...
Large amounts of data have made neural machine translation (NMT) a big su...
Traditional neural machine translation is limited to the topmost encoder...
The standard neural machine translation model can only decode with the s...
Deep encoders have proven effective in improving neural machi...
Knowledge distillation has proven effective in model accelera...
8-bit integer inference, as a promising direction for reducing both the
l...
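A minimal sketch of the symmetric post-training scheme commonly behind 8-bit integer inference; the per-tensor scaling here is an illustrative assumption, not necessarily the paper's method:

    import numpy as np

    def quantize_int8(w):
        # Map [-max|w|, max|w|] linearly onto the int8 range [-127, 127].
        scale = max(np.abs(w).max() / 127.0, 1e-12)  # guard against all-zero tensors
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)
    q, s = quantize_int8(w)
    w_hat = dequantize(q, s)  # per-weight error is at most scale / 2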
We present a novel 3D pose refinement approach based on differentiable
r...
Most deep learning frameworks require users to pool their local data or ...
Science of science (SciSci) is an emerging discipline wherein science is...
In encoder-decoder neural models, multiple encoders are generally used ...
Neural architecture search (NAS) has advanced significantly in recent ye...
Accurately predicting drug-target binding affinity (DTA) in silico is a ...
Neural machine translation systems require a number of stacked layers fo...
Though early successes of Statistical Machine Translation (SMT) systems ...
High frequency noise has generally been difficult to cancel actively...
High frequency noise has been difficult to cancel actively at a person's...
Recently, the Transformer machine translation system has shown strong re...
Word embedding is central to neural machine translation (NMT), which has...
The Transformer is the state-of-the-art model in recent machine translation
...
The rapid evolution of scientific research has been creating a huge volu...
Person re-identification aims to robustly measure similarities between p...
Person re-identification aims at finding a person of interest in an imag...
Vehicle re-identification is an important problem and has many applicati...