Research On Optimization Technologies For Decoding In Phrase-Based Statistical Machine Translation

Posted on:2016-01-27

Degree:Master

Type:Thesis

Country:China

Candidate:Y T Qu

Full Text:PDF

GTID:2405330542457310

Subject:Computer technology

Abstract/Summary:


Machine translation, the process of translating text from one natural language into another, is an important branch of natural language processing. The phrase-based statistical machine translation model is the most widely used model; its strong performance and robustness have made it a research hotspot in statistical machine translation. The decoder is the core module of a phrase-based statistical machine translation system: it carries out the actual translation, and its design and implementation directly affect both translation quality and decoding speed. During decoding, only partial translation hypotheses are visible, which can lead to search errors: a hypothesis that would have led to a better translation may be pruned. This thesis optimizes decoding in phrase-based statistical machine translation, aiming to reduce search errors and improve the performance of the translation system. The work has two parts.

(1) Decoding algorithm optimization: For the stack decoding algorithm, reordering limits and punctuation limits are used to improve the decoding speed and the performance of the original algorithm. In addition, a group pruning strategy is proposed: translation hypotheses that are more comparable with one another are placed in the same group, and each group is pruned separately. The number of hypotheses retained in each group is set according to the distribution of that group in the high-quality candidate set.

(2) Dynamic discriminative translation model: The thesis proposes a dynamic discriminative translation model that uses more context information to evaluate the translation likelihood of phrase pairs dynamically, so that the translation system selects the translation fragments best suited to the context. The core of the model uses word alignment information from bilingual sentence pairs to generate a large number of discriminative features, and uses errors made during decoding to learn positive and negative instances. The model is trained with a neural network.

Experimental results on large-scale data show that the decoding optimization techniques in this thesis reduce search errors in decoding and improve the translation performance of the phrase-based statistical machine translation system.
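The group pruning strategy in part (1) can be illustrated with a minimal Python sketch. The abstract does not say how groups are formed or how the retained counts are computed, so everything below is an assumption made for illustration: hypotheses are grouped by the set of source positions they cover, and each group's quota is proportional to its share of a high-quality candidate set. The names `Hypothesis`, `group_key`, `allocate_quotas`, and `group_prune` are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    coverage: frozenset  # source positions translated so far
    score: float         # model score, higher is better
    output: str          # partial translation text


def group_key(hyp):
    # Assumed grouping rule: hypotheses covering the same source
    # positions are treated as comparable and share a group.
    return hyp.coverage


def allocate_quotas(group_share, stack_size, min_keep=1):
    """Map each group's share of the high-quality set to a keep count.

    group_share: dict of group key -> fraction (assumed to sum to 1).
    Because of rounding and min_keep, the quotas may not sum exactly
    to stack_size.
    """
    return {
        key: max(min_keep, round(share * stack_size))
        for key, share in group_share.items()
    }


def group_prune(stack, quotas, default_keep=1):
    """Keep the best-scoring hypotheses within each group.

    Groups missing from quotas keep default_keep hypotheses.
    """
    groups = defaultdict(list)
    for hyp in stack:
        groups[group_key(hyp)].append(hyp)

    kept = []
    for key, members in groups.items():
        members.sort(key=lambda h: h.score, reverse=True)
        kept.extend(members[: quotas.get(key, default_keep)])
    return kept
```

Compared with ordinary histogram pruning, which keeps the top-scoring hypotheses of a stack regardless of what they cover, this sketch only compares hypotheses inside the same group, so a group with uniformly lower scores is not eliminated outright.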

Keywords/Search Tags:

statistical machine translation, phrase-based statistical machine translation, stack decoding algorithm, group pruning, dynamic discriminative translation model, neural network


Related items

1	Research On Optimization Of Language Model Based On Statistical Machine Translation
2	Research On The Key Technologies For Phrase-based Tibetan-English Statistical Machine Translation
3	Research On Attention-Based Neural Machine Translation With Encoder-Decoder Architecture
4	Research On Domain Adaptation For Statistical Machine Translation Based On Topic And Semantic Analysis
5	Research And Implementation Of Neural Machine Translation Model Based On Fusion Of Dependency Syntactic Information
6	Topically-Informed Bilingually-Constrained Recursive Autoencoders For Statistical Machine Translation
7	Constructing An Integrated System: Rule-based And Statistics-based Machine Translation
8	Research And Implementation On English-Chinese Personal Name Transliteration Methods
9	A Sentence-level Quality Estimation For Neural Machine Translation Based On Subword Regularization
10	A Report On The Translation Of An Excerpt From Machine Translation