
The Research Of Statistical Machine Translation Model Based On Deep Neural Network

Posted on: 2017-11-05
Degree: Master
Type: Thesis
Country: China
Candidate: J X Li
Full Text: PDF
GTID: 2348330503486899
Subject: Computer Science and Technology
Abstract/Summary:
As Internet applications multiply and online interaction becomes more frequent, the volume of resources on the Internet is growing explosively. In response, data-driven methods such as deep learning have been proposed, and researchers have begun to reconsider established tasks from a new perspective. This thesis examines the design and modeling problems of traditional machine translation and uses deep neural networks to address them.
Drawing on a study of traditional statistical machine translation methods and an analysis of how deep learning is applied in natural language processing, the thesis aims to build a joint neural network model that covers the whole translation process and completes the translation task directly, end to end. Earlier improvements to machine translation models usually replaced a single submodule with a neural network; this work instead models all submodules jointly and decomposes the translation process into two modules, an encoder and a decoder. On this basis it optimizes the language model, the word alignment component, and the generation algorithm of the output module, and proposes RNN-embed, a machine translation model based on a recursive neural network. The main advantage of this model is that it requires no word-level processing of the bilingual text: parallel bilingual data are treated as two highly correlated, time-ordered sequences. All data are fed into the deep neural network at the character level during training, which addresses the difficulty deep neural networks have in handling the high-level semantics of text information.
Based on this model, the project implements a pluggable, general-purpose machine translation framework and accelerates training with multithreading and GPU computation.
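The character-level input described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the helper names `build_char_vocab` and `encode`, the special tokens `<pad>`, `<bos>`, and `<eos>`, the use of separate source and target vocabularies, and the two example sentence pairs are all assumptions made for illustration. The sketch only shows how a parallel sentence pair can be turned into two index sequences without any word segmentation.

```python
from typing import Dict, List

# Hypothetical special tokens; the thesis does not specify which ones it uses.
PAD, BOS, EOS = "<pad>", "<bos>", "<eos>"


def build_char_vocab(sentences: List[str]) -> Dict[str, int]:
    """Map every distinct character to an integer id (no word segmentation)."""
    vocab = {PAD: 0, BOS: 1, EOS: 2}
    for sentence in sentences:
        for ch in sentence:
            if ch not in vocab:
                vocab[ch] = len(vocab)
    return vocab


def encode(sentence: str, vocab: Dict[str, int]) -> List[int]:
    """Turn a sentence into a time-ordered sequence of character ids."""
    return [vocab[BOS]] + [vocab[ch] for ch in sentence] + [vocab[EOS]]


# Illustrative English-Chinese pairs, not taken from the thesis data.
pairs = [("I see.", "我明白了。"), ("See you.", "再见。")]

src_vocab = build_char_vocab([src for src, _ in pairs])
tgt_vocab = build_char_vocab([tgt for _, tgt in pairs])

# Each pair becomes two index sequences, one for the encoder input
# and one for the decoder target.
encoded = [(encode(src, src_vocab), encode(tgt, tgt_vocab)) for src, tgt in pairs]
```

Because the vocabulary grows with the number of distinct characters rather than distinct words, it stays small even for large corpora, which is one plausible reading of how character-level input avoids a very large dictionary.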
Because no suitable public dataset was available, English-Chinese subtitle data were crawled from websites; preprocessing yielded more than ten million bilingual sentences, part of which served as the experimental data. In the experiments, compared with the authoritative Moses statistical machine translation system, the RNN-embed model greatly reduces training time and also lowers the complexity of setting up the translation process. Compared with the best existing neural-network-based machine translation models, the model resolves the large-dictionary problem through the way it segments the input data, and further improves the translation of long sentences.
Keywords/Search Tags: statistical machine translation, deep learning, recursive neural network