Research On Optimization Technologies For Decoding In Phrase-Based Statistical Machine Translation

Posted on: 2016-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Qu
Full Text: PDF
GTID: 2405330542457310
Subject: Computer technology
Abstract/Summary:
Machine translation, the automatic translation of text from one natural language into another, is an important branch of natural language processing. The phrase-based model is the most widely used statistical machine translation model; its strong performance and robustness have made it a focus of current research in statistical machine translation. The decoder is the core module of a phrase-based statistical machine translation system: it carries out the actual translation, and its design and implementation directly affect both translation quality and decoding speed. During decoding, only partial translation hypotheses are visible, which can lead to search errors: a hypothesis that would have led to a better translation may be pruned. This thesis optimizes decoding in phrase-based statistical machine translation, aiming to reduce search errors and improve the performance of the translation system. The work has two parts:

(1) Decoding algorithm optimization: For the stack decoding algorithm, reordering limits and punctuation limits are used to improve the decoding speed and translation performance of the original algorithm. In addition, a group pruning strategy is proposed that places the more comparable translation hypotheses in the same group and then prunes each group separately. The number of hypotheses retained in each group is set according to that group's share of the high-quality candidate set.

(2) Dynamic discriminative translation model: The thesis proposes a dynamic discriminative translation model that uses more context information to evaluate the translation likelihood of phrase pairs dynamically, so that the translation system selects the translation fragments best suited to the context. The core of the model is to use the word alignment information of bilingual sentence pairs to generate a large number of discriminative features, and to use the errors made during decoding to learn from positive and negative instances. The model is trained with a neural network.

Experimental results on large-scale data show that the decoding optimization techniques presented in the thesis reduce search errors in decoding and improve the translation performance of the phrase-based statistical machine translation system.
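
The group pruning idea in part (1) can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not say how hypotheses are grouped or how the high-quality candidate set is built, so the `group` key, the `top_n` cutoff that defines the high-quality set, the proportional quota formula, and the rule that every group keeps at least one hypothesis are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A partial translation. All fields are illustrative assumptions."""
    text: str
    score: float  # model score, higher is better
    group: int    # hypothetical grouping key for "comparable" hypotheses


def group_quotas(hyps, stack_size, top_n):
    """Give each group a quota proportional to its share of the top_n hypotheses."""
    # Assumption: the "high-quality candidate set" is the top_n by score.
    best = sorted(hyps, key=lambda h: h.score, reverse=True)[:top_n]
    counts = defaultdict(int)
    for h in best:
        counts[h.group] += 1
    groups = {h.group for h in hyps}
    # Assumption: each group keeps at least one hypothesis.
    return {g: max(1, round(stack_size * counts[g] / len(best))) for g in groups}


def group_prune(hyps, stack_size, top_n):
    """Prune each group separately down to its quota; return survivors by score."""
    quotas = group_quotas(hyps, stack_size, top_n)
    by_group = defaultdict(list)
    for h in hyps:
        by_group[h.group].append(h)
    kept = []
    for g, members in by_group.items():
        members.sort(key=lambda h: h.score, reverse=True)
        kept.extend(members[:quotas[g]])
    return sorted(kept, key=lambda h: h.score, reverse=True)
```

Because of rounding and the one-per-group floor, the number of survivors can differ slightly from `stack_size`. The point of the sketch is only that a low-scoring group is not eliminated outright, which a single global cutoff on score would do.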
Keywords/Search Tags: statistical machine translation, phrase-based statistical machine translation, stack decoding algorithm, group pruning, dynamic discriminative translation model, neural network