Building A Treebank For English-Chinese Machine Translation

Posted on: 2008-11-22
Degree: Master
Type: Thesis
Country: China
Candidate: X F Mu
Full Text: PDF
GTID: 2155360215981105
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
Machine translation (MT) research has made great progress over the past sixty years, both in application development and in theoretical innovation. Some MT applications, such as Google's online translation service, have even entered people's daily lives.

Despite this success, the development of MT is still constrained by semantics, in particular the ambiguity of words and of phrasal structures. Because semantics is so complex, NLP researchers have resorted to various means of avoiding semantic computation, one of which is bilingual alignment. Bilingual alignment, which includes word alignment and phrase alignment, is a popular and effective way to extract linguistic knowledge. However, bilingual texts often contain free translation, dislocated paragraphs, and omitted passages. In addition, ambiguity in translation and in syntactic position, high- and low-frequency strings, limited dictionary coverage, and out-of-vocabulary words all make the knowledge extracted from bilingual text very rough, and syntactic elements in one language often cannot be matched with counterparts in the other.

Taking into account the problem of semantics and the advantages and disadvantages of bilingual alignment, we have built a Treebank for English-Chinese MT. Its main features are the parsing of English sentences and the modification of some phrasal structures. In addition, every English word has been translated into Chinese according to a set of principles, and the word translations are passed upward to the root node, where the translation of the whole English sentence is obtained. Consequently, a Chinese translation is attached to every node of the parse tree of the English sentence, and the two languages are placed in the same grammar system. A large number of translation models have also been extracted, which are very useful in syntactic parsing and translation generation.
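The upward passing of word translations described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the Treebank's node format, its translation principles, or how Chinese word order is decided, so everything here is an assumption made for illustration: the `Node` class, the `order` field (a hypothetical per-node permutation of the children on the Chinese side), and the functions `propagate` and `extract_rules` are not taken from the thesis.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    label: str                          # syntactic category or POS tag
    word: Optional[str] = None          # English word (leaf nodes only)
    gloss: Optional[str] = None         # Chinese translation of the word (leaf nodes only)
    children: List["Node"] = field(default_factory=list)
    order: Optional[List[int]] = None   # hypothetical: Chinese-side order of the children
    translation: str = ""               # filled in by propagate()


def propagate(node: Node) -> str:
    """Pass translations upward so that every node carries a Chinese string."""
    if not node.children:
        node.translation = node.gloss or ""
        return node.translation
    parts = [propagate(child) for child in node.children]
    order = node.order if node.order is not None else range(len(parts))
    node.translation = "".join(parts[i] for i in order)
    return node.translation


def extract_rules(node: Node) -> List[Tuple[str, Tuple[str, ...], Tuple[int, ...]]]:
    """Collect (parent label, child labels, Chinese-side order) for each internal node."""
    if not node.children:
        return []
    order = tuple(node.order) if node.order is not None else tuple(range(len(node.children)))
    rules = [(node.label, tuple(c.label for c in node.children), order)]
    for child in node.children:
        rules.extend(extract_rules(child))
    return rules


# Illustrative tree for "the book on the table"; glosses and orders are made up.
tree = Node("NP", order=[1, 0], children=[
    Node("NP", children=[
        Node("DT", word="the", gloss=""),
        Node("NN", word="book", gloss="书"),
    ]),
    Node("PP", order=[1, 0], children=[
        Node("IN", word="on", gloss="上的"),
        Node("NP", children=[
            Node("DT", word="the", gloss=""),
            Node("NN", word="table", gloss="桌子"),
        ]),
    ]),
])

sentence = propagate(tree)
rules = extract_rules(tree)
```

After `propagate(tree)` runs, each node holds the Chinese string for its own span, and the root holds the translation of the whole phrase. The tuples returned by `extract_rules` are one simple way to picture the kind of translation model that could be read off such a tree; the form actually used in the thesis may differ.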
Keywords/Search Tags: machine translation, bilingual alignment, Treebank, guideline, evaluation