
The Bilingual Word Alignment - Based On The Chinese Model Of IBM

Posted on: 2014-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: H G Ruan
Full Text: PDF
GTID: 2268330401973356
Subject: Computer application technology
Abstract/Summary:
In recent years, statistical bilingual word alignment methods have proven highly effective in Machine Translation (MT). A variety of word alignment methods have been studied, including log-linear models for word alignment, Statistical Machine Translation (SMT) models, and linguistic analysis methods. Each method has its own characteristics, and each has performed well in different domains. This thesis studies the principles and engineering implementation of the IBM Models, with a focus on word alignment techniques. The work centers on Chinese-Vietnamese bilingual word alignment based on the IBM Models and proposes an effective method: first identify the inconsistent parts between the bidirectional word alignments generated by the IBM Models, then apply an algorithm to optimize the word alignment. In addition, the thesis presents a Chinese-Vietnamese bilingual corpus for word alignment, on which the IBM-model-based approach achieves high word alignment accuracy.

A statistical machine translation toolkit was used to train IBM Model 1 through Model 5, and the open-source GIZA++ [Och, 2000; Och et al., 2003][36] was used to generate the word alignments. The word alignment experiments show that the method is feasible and that it has reached the expected preliminary stage. The research provides a solid platform for future work.
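The first step described in the abstract, finding where the two alignment directions disagree, can be illustrated with a minimal sketch. The abstract does not specify the thesis's optimization algorithm, so only the identification step is shown. The function name, the link representation as index pairs, and the toy data are all assumptions made for illustration, not the thesis's implementation or GIZA++ output format.

```python
from typing import Set, Tuple

# A link is a pair of word positions: (source_index, target_index).
Link = Tuple[int, int]


def split_bidirectional(forward: Set[Link], backward: Set[Link]):
    """Split two directional alignments into agreed and inconsistent links.

    forward:  links from the source-to-target run, stored as (src, tgt).
    backward: links from the target-to-source run, stored as (tgt, src).
    Both representations are assumptions for this sketch.
    """
    # Flip the backward links so both sets use (src, tgt) order.
    backward_flipped = {(src, tgt) for (tgt, src) in backward}
    # Links proposed by both directions.
    agreed = forward & backward_flipped
    # Links proposed by exactly one direction: the inconsistent parts.
    inconsistent = forward ^ backward_flipped
    return agreed, inconsistent


# Hypothetical toy alignments for a three-word sentence pair.
forward_links = {(0, 0), (1, 1), (2, 1)}
backward_links = {(0, 0), (1, 1), (2, 2)}  # stored as (tgt, src)

agreed_links, inconsistent_links = split_bidirectional(forward_links, backward_links)
print(sorted(agreed_links))        # [(0, 0), (1, 1)]
print(sorted(inconsistent_links))  # [(2, 1), (2, 2)]
```

Any later optimization step would then decide which of the inconsistent links to keep; how the thesis makes that decision is not stated in the abstract.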
Keywords/Search Tags: bilingual word alignment, bilingual corpus, IBM model