Character-level Neural Machine Translation

Posted on: 2019-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: S J Zhao
GTID: 2428330590967374
Subject: Computer Science and Technology
Abstract/Summary:
With the growth of global communication, the language barrier has become a problem that urgently needs to be solved, and machine translation is an effective way to overcome it. The idea of machine translation dates back to the 17th century, and statistical machine translation developed in the late 20th century. With the development of neural networks and deep learning, machine translation has adopted these techniques and made new progress. Neural machine translation aims at building a single neural network that can be jointly trained to maximize translation performance. The encoder-decoder architecture with the attention mechanism achieves translation performance comparable to existing phrase-based systems. However, the use of a large vocabulary has become one of the bottlenecks in neural machine translation. For instance, word-level neural machine translation performs worse on Czech, because Czech is a Slavic language with rich and complex inflection as well as fusional morphology.

We propose an efficient character-level neural machine translation approach to handle such languages better. We introduce a hierarchical encoder and a hierarchical decoder, which together constitute the deep character-level neural machine translation (DCNMT) model. The hierarchical encoder first encodes character sequences to obtain word-level representations, then learns semantic features from those word-level representations. The hierarchical decoder decodes at the character level after obtaining the word-level representation. Such a deep model has two major advantages: it avoids the large-vocabulary issue entirely, and it is much faster than conventional character-based models. More interestingly, our model is able to translate misspelled words, much as a human would.

Since many languages originate from a common ancestral language and influence each other, similarities inevitably exist between them, such as lexical similarity and named-entity similarity. We further leverage these similarities to improve translation performance in our character-level neural machine translation. Specifically, we introduce an attention-via-attention (AvA) mechanism that guides information from source-side characters to flow directly to the target side. With this mechanism, target-side characters are generated based on the representations of source-side characters when the words are similar. For instance, our proposed neural machine translation system learns to transfer the character-level information of the English word 'system' through the attention-via-attention mechanism to generate the Czech word 'systém'. This eases the burden on the decoder. Consequently, our approach not only achieves strong translation performance but also reduces the model size significantly.

We train the models on English-French and English-Czech translation tasks and compare them with various strong baselines, including RNNsearch, bpe2char models, char2char models and hybrid models. We evaluate the effectiveness and efficiency of the proposed models through quantitative results and qualitative analysis.
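
As a rough illustration of the hierarchical encoding idea described in the abstract (characters first, then word-level representations), the following Python sketch builds a vector for each word purely from its characters. Everything here is an assumption made for illustration: the names `char_embedding`, `encode_word` and `encode_sentence` are hypothetical, the embeddings are random rather than learned, and mean pooling stands in for the learned character encoder, which the abstract does not specify. It is not the DCNMT implementation.

```python
import math
import random


def char_embedding(ch, dim=16):
    """Deterministic toy embedding for one character.

    Assumption: stands in for a learned character embedding table.
    """
    rng = random.Random(ord(ch))  # seed by code point so results are repeatable
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def encode_word(word, dim=16):
    """Character level -> word level: mean-pool the character embeddings.

    Assumption: mean pooling is a placeholder for a learned character encoder.
    """
    vecs = [char_embedding(c, dim) for c in word]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def encode_sentence(sentence, dim=16):
    """Produce one word-level vector per word, built only from characters."""
    return [encode_word(w, dim) for w in sentence.split()]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


corpus = "the system translates the systems and the translated system"
word_vocab = set(corpus.split())           # word-level vocabulary
char_vocab = set(corpus.replace(" ", ""))  # character-level vocabulary

word_vectors = encode_sentence(corpus)

# A misspelling is out of vocabulary at the word level,
# but every one of its characters is still known.
typo = "systen"
typo_is_oov = typo not in word_vocab
typo_chars_known = set(typo) <= char_vocab
sim_typo = cosine(encode_word("system"), encode_word(typo))
```

In this toy corpus the misspelling 'systen' is missing from the word vocabulary, so a word-level model would have to treat it as unknown, yet all of its characters are covered and a representation can still be built; `sim_typo` holds its cosine similarity to the vector for 'system'. Mean pooling ignores character order, which a learned encoder would not.
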
Keywords/Search Tags: Character level, Machine translation, Neural network, Attention mechanism