Large language models achieve state-of-the-art performance on sequence
g...
Applying Reinforcement Learning (RL) to sequence generation models enabl...
While state-of-the-art NLP models have demonstrated excellent performanc...
Parameter sharing has proven to be a parameter-efficient approach. Previ...
Deploying NMT models on mobile devices is essential for privacy, low lat...
Using translation memories (TMs) as prompts is a promising approach to
i...
Learning multiscale Transformer models has been shown to be a viable
ap...
For years, model performance in machine learning has obeyed a power-law
r...
Knowledge distillation addresses the problem of transferring knowledge f...
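The distillation objective mentioned here can be sketched in a few lines. This is a minimal illustration, not the paper's exact method: it assumes the standard softened-softmax formulation (temperature-scaled KL divergence between teacher and student outputs), and the function names (`softmax`, `distillation_loss`) are my own.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T gives a softer distribution.
    m = max(l / T for l in logits)
    exps = [math.exp(l / T - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 so gradients stay comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return T * T * kl
```

When student and teacher logits agree the loss is zero; any mismatch yields a positive penalty that pulls the student toward the teacher's output distribution.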
Improving machine translation (MT) systems with translation memories (TM...
In this paper, we propose a novel architecture, the Enhanced Interactive...
Multiscale feature hierarchies have witnessed success in the
co...
Previous work on multimodal machine translation (MMT) has focused on the...
This paper describes NiuTrans neural machine translation systems of the ...
This paper describes the submissions of the NiuTrans Team to the WNGT 20...
This paper describes the NiuTrans system for the WMT21 translation effic...
This paper addresses the efficiency challenge of Neural Architecture Sea...
Improving Transformer efficiency has become increasingly attractive rece...
This paper describes the submission of the NiuTrans end-to-end speech
tr...
Encoder pre-training is promising in end-to-end Speech Translation (ST),...
It has been found that residual networks are an Euler discretization of
...
The large attention-based encoder-decoder network (Transformer) has beco...
Recently, deep models have shown tremendous improvements in neural machi...
Unsupervised Bilingual Dictionary Induction methods based on the
initial...
Large amounts of data have made neural machine translation (NMT) a big su...
Traditional neural machine translation is limited to the topmost encoder...
The standard neural machine translation model can only decode with the s...
Deep encoders have been proven to be effective in improving neural machi...
Knowledge distillation has been proven to be effective in model accelera...
8-bit integer inference, as a promising direction in reducing both the
l...
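The core of 8-bit integer inference is mapping float tensors onto the int8 range and back. A minimal sketch of symmetric per-tensor quantization, assuming the common max-abs calibration scheme (the helper names are my own, not the system described above):

```python
def quantize_int8(xs):
    # Symmetric per-tensor quantization: map floats onto [-127, 127]
    # using a single scale derived from the largest magnitude.
    scale = max(abs(x) for x in xs) / 127.0 or 1.0  # guard all-zero input
    q = [max(-127, min(127, round(x / scale))) for x in xs]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats; error is bounded by the scale.
    return [qi * scale for qi in q]
```

Matrix products can then run in int8 with int32 accumulation, cutting both memory traffic and latency at a small, bounded accuracy cost.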
In encoder-decoder neural models, multiple encoders are generally used ...
Neural architecture search (NAS) has advanced significantly in recent ye...
Neural machine translation systems require a number of stacked layers fo...
Though early successes of Statistical Machine Translation (SMT) systems ...
Recently, the Transformer machine translation system has shown strong re...
Word embedding is central to neural machine translation (NMT), which has...
Transformer is the state-of-the-art model in recent machine translation
...
This paper proposes a hierarchical attentional neural translation model ...