Agreement On Target-Bidirectional Neural Machine Translation

In this way, the two models combine their benefits of generating translations with good prefixes and good suffixes. Our experiments are carried out on Chinese-English and English-German translation tasks, and they show that our proposed method significantly outperforms state-of-the-art baselines.

2 Our Approach

To address the problem of exposure bias, we try to maximize the agreement between the L2R and R2L translations. We introduce two Kullback-Leibler (KL) divergences between the probability distributions defined by the L2R and R2L models into the NMT training objective, and decompose that objective into a maximum likelihood term plus a regularization term. Thus we can not only maximize the likelihood of the training data but also minimize the regularization term, which measures the divergence between the L2R and R2L models under the current parameters; this divergence serves as a measure of exposure bias.

In this section, we first introduce the notation of the basic models and the exposure bias problem. We then develop the concept of agreement regularization, showing how to regularize the L2R model with the R2L model and how to approximate the gradient efficiently. Finally, we show that the R2L model can in turn be improved with the help of the L2R model, so we integrate the R2L and L2R models into a joint training framework in which they serve as helper systems to each other, and the two models obtain further improvements through an iterative updating process.
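To make the shape of the decomposed objective concrete, below is a minimal Python/NumPy sketch of a loss of the kind described above: a maximum-likelihood term for the L2R model plus a KL regularizer that penalizes disagreement with the R2L model's per-position predictions. All identifiers (regularized_loss, l2r_probs, r2l_probs, lam) are illustrative assumptions, not the paper's code.

import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two categorical distributions over the vocabulary."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q))

def regularized_loss(l2r_probs, r2l_probs, target_ids, lam=1.0):
    """Toy version of the decomposed objective.

    l2r_probs, r2l_probs: (seq_len, vocab) arrays of per-position
    distributions from the two models, aligned to the same left-to-right
    target positions. target_ids: gold token ids, length seq_len.
    """
    # Maximum-likelihood term: -sum_t log P(y_t | y_<t, x).
    nll = -np.sum(np.log(l2r_probs[np.arange(len(target_ids)), target_ids]))
    # Regularization term: divergence between the two models' predictions.
    reg = sum(kl_divergence(p, q) for p, q in zip(l2r_probs, r2l_probs))
    return nll + lam * reg

The sketch only illustrates the two terms of the objective; the paper's actual regularizer is computed over model distributions with an efficient gradient approximation rather than this direct per-position sum.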

Given a source sentence $\mathbf{x} = (x_1, x_2, \ldots, x_T)$ and its target translation $\mathbf{y} = (y_1, y_2, \ldots, y_{T'})$, let $P(\mathbf{y}|\mathbf{x}; \overrightarrow{\theta})$ and $P(\mathbf{y}|\mathbf{x}; \overleftarrow{\theta})$ denote the L2R and R2L translation models, respectively. The L2R translation model can be decomposed as

$$P(\mathbf{y}|\mathbf{x}; \overrightarrow{\theta}) = \prod_{t=1}^{T'} P(y_t \mid \mathbf{y}_{<t}, \mathbf{x}; \overrightarrow{\theta}),$$

while the R2L model decomposes symmetrically and uses the later targets $y_{t+1}, \ldots, y_{T'}$ to predict the current target $y_t$:

$$P(\mathbf{y}|\mathbf{x}; \overleftarrow{\theta}) = \prod_{t=1}^{T'} P(y_t \mid \mathbf{y}_{>t}, \mathbf{x}; \overleftarrow{\theta}).$$
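To make the two factorizations concrete, here is a small Python sketch that scores a target sequence by the chain rule in either direction. The callable cond_prob is a hypothetical stand-in for the model's conditional distribution $P(y_t \mid \text{context}, \mathbf{x})$; it is not part of the paper.

import math

def sequence_log_prob(cond_prob, x, y, direction="l2r"):
    """Chain-rule log-score of target y given source x.

    cond_prob(x, context, y_t) -> P(y_t | context, x) stands in for the
    neural conditional; `direction` selects which target context is used.
    """
    total = 0.0
    for t in range(len(y)):
        if direction == "l2r":
            context = y[:t]        # earlier targets y_1 .. y_{t-1}
        else:
            context = y[t + 1:]    # later targets y_{t+1} .. y_{T'}
        total += math.log(cond_prob(x, context, y[t]))
    return total

In a real system these conditionals come from the decoder one step at a time, so the L2R model consumes the target prefix while the R2L model consumes the target suffix.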

In practice, NMT systems are usually implemented with an attention-based encoder-decoder architecture.
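As a reminder of what the attention mechanism computes at each decoding step, here is a generic dot-product attention sketch over encoder annotations; it illustrates the idea only and is not the specific architecture used in the paper.

import numpy as np

def attention_context(decoder_state, encoder_states):
    """Dot-product attention at one decoding step.

    decoder_state: (d,) hidden state at the current target position.
    encoder_states: (T, d) annotations of the source words.
    Returns a (d,) context vector: a relevance-weighted average of
    the source annotations.
    """
    scores = encoder_states @ decoder_state    # (T,) relevance scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax over source words
    return weights @ encoder_states            # (d,) context vector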