| With the development of the Internet, an enormous amount of information and documents needs to be exchanged between people who speak different languages. Machine translation, as an efficient tool for translating between languages, has improved significantly in recent years. Many web-based translation services have been released by commercial companies and make translation easier for users. Although the translation quality of state-of-the-art techniques is adequate for everyday work, it still needs further improvement. System combination first emerged as an effective way to improve the performance of automatic speech recognition and was introduced to the field of machine translation in 2006. Since then, its effectiveness has been demonstrated by competition results and real-life applications, and it has become a widely used technique. Centered on system combination, this study covers the following aspects: (1) An analysis of the factors that affect the performance of system combination. We define the oracle score of system combination, which serves as our metric of system combination performance. We first analyze the impact of the number of individual systems involved in the combination. We then compare single-source and multiple-source system combination; the results show that the oracle score of multiple-source combination is higher than that of single-source combination, which indicates greater potential for performance improvement. Finally, we report the BLEU scores of sentence-level and word-level system combination. The experimental results show that word-level combination outperforms sentence-level combination. (2) A study of selection methods for system combination candidates. In most cases, there are tens of combination candidates for each source sentence. Using all available candidates as input does not necessarily lead to the best performance.
We need to select the candidates with high translation quality from among all the candidates. In this paper, we perform selection using machine learning techniques. A statistical model is learned from the training set and used to rank the candidates for the test set. We perform combination using candidates with increasing rank; the experimental results show a performance improvement when fewer combination candidates are used. (3) Confusion network decoding with an online language model. Decoding is one of the most important steps in word-level system combination, and the features used in the log-linear model can affect combination performance. Widely used features include the language model score, n-gram count matches, the number of words in the hypothesis, and the word confidence score. The language model used in this context is trained on a large corpus, and its score measures the fluency of a hypothesis. The n-gram count matches feature is a typical piece of local information that can be obtained from the hypotheses. Experimental results show that n-gram count matches improve combination performance significantly. For this reason, we add more local information to confusion network decoding. The new local features include a local language model, a skip-gram model, and word posterior probability. For each feature, we analyze its impact on combination performance, and we also analyze combinations of features. For system combination, we present an empirical study of the factors that affect combination performance. We then improve combination performance by filtering out translation candidates with low translation quality, and we add more local features to the log-linear model used in system combination. |
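
As an illustration of the log-linear scoring described in aspect (3), the following is a minimal sketch and not the implementation used in this study. It scores complete hypotheses rather than paths through a confusion network, and the function names, the feature names (`lm`, `bigram_match`, `length`), the weights, and the `lm_logprob` callback are all hypothetical. The n-gram count matches feature is assumed here to be the clipped number of hypothesis n-grams that also occur in each system's candidate translation.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_match_count(hyp, candidates, n):
    """Sum, over all candidates, the clipped count of hypothesis n-grams
    that also appear in that candidate (an assumed definition)."""
    hyp_counts = Counter(ngrams(hyp, n))
    total = 0
    for cand in candidates:
        cand_counts = Counter(ngrams(cand, n))
        total += sum(min(c, cand_counts[g]) for g, c in hyp_counts.items())
    return total


def loglinear_score(hyp, candidates, weights, lm_logprob):
    """Weighted sum of feature values for one hypothesis."""
    features = {
        "lm": lm_logprob(hyp),                                   # fluency
        "bigram_match": ngram_match_count(hyp, candidates, 2),   # local agreement
        "length": len(hyp),                                      # word count
    }
    return sum(weights[name] * value for name, value in features.items())


def best_hypothesis(hyps, candidates, weights, lm_logprob):
    """Pick the hypothesis with the highest log-linear score."""
    return max(hyps, key=lambda h: loglinear_score(h, candidates, weights, lm_logprob))


if __name__ == "__main__":
    candidates = [["the", "cat", "sat"], ["the", "cat", "sits"]]
    hyps = [["the", "cat", "sat"], ["a", "dog", "sat"]]
    weights = {"lm": 1.0, "bigram_match": 1.0, "length": 0.0}
    # A constant stand-in for a real language model (hypothetical).
    print(best_hypothesis(hyps, candidates, weights, lambda h: 0.0))
```

In practice the weights would be tuned on a development set, and further local features such as a local language model or word posterior probability would enter the same weighted sum as additional entries in `features`.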