Generative Bridging Network in Neural Sequence Prediction
Maximum Likelihood Estimation (MLE) suffers from the data sparsity problem in sequence prediction tasks where training resources are scarce. To alleviate this problem, in this paper we propose a novel generative bridging network (GBN) to train sequence prediction models; it contains a generator and a bridge. Unlike MLE, which directly maximizes the likelihood of the ground truth, the bridge extends the point-wise ground truth to a bridge distribution (containing inexhaustible examples), and the generator is trained to minimize their KL-divergence. To guide the training of the generator with additional signals, the bridge distribution can be set or trained to possess specific properties by imposing different constraints. More specifically, to increase output diversity, enhance language smoothness, and relieve the learning burden, three different regularization constraints are introduced to construct bridge distributions. By combining these bridges with a sequence generator, three independent GBNs are proposed, namely the uniform GBN, the language-model GBN, and the coaching GBN. Experiments conducted on two well-recognized sequence prediction tasks (machine translation and abstractive text summarization) show that our proposed GBNs yield significant improvements over strong baseline systems. Furthermore, by analyzing samples drawn from the bridge distributions, we verify their expected influences on sequence model training.
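To make the training objective concrete, below is a minimal PyTorch sketch of one GBN update, assuming a uniform-style bridge realized as random token replacement of the ground truth. Minimizing KL(bridge || generator) over a fixed bridge reduces to maximizing the generator's expected log-likelihood of bridge samples, i.e. cross-entropy against a bridge-sampled target instead of the point-wise ground truth. The `generator` call signature, the `tau` replacement rate, and the helper names are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def uniform_bridge_sample(target, vocab_size, tau=0.1, pad_id=0):
    """Draw one sample from a uniform-style bridge distribution:
    each non-pad ground-truth token is replaced by a random vocabulary
    token with probability tau (tau is a hypothetical hyperparameter)."""
    noise = torch.randint_like(target, low=1, high=vocab_size)
    mask = (torch.rand(target.shape, device=target.device) < tau) & (target != pad_id)
    return torch.where(mask, noise, target)

def gbn_step(generator, optimizer, src, tgt, vocab_size, pad_id=0):
    """One GBN training step: cross-entropy against a bridge sample,
    which is a single-sample estimate of the KL-minimization objective.
    Assumes generator(src, tgt_in) -> logits of shape
    (batch, tgt_len - 1, vocab_size), teacher-forced on the bridge sample."""
    bridged = uniform_bridge_sample(tgt, vocab_size, pad_id=pad_id)
    logits = generator(src, bridged[:, :-1])
    loss = F.cross_entropy(
        logits.reshape(-1, vocab_size),
        bridged[:, 1:].reshape(-1),
        ignore_index=pad_id,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Setting `tau=0` recovers plain MLE, which makes the contrast in the abstract explicit: the bridge only changes which target sequences the generator is asked to explain, not the generator architecture itself.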