
English Word Ambiguities And Upper And Lower Bounds Of WSD In English-Chinese Machine Translation

Posted on: 2007-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: S K Qin
GTID: 2155360212977768
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
For the last 50 years, linguists and computer scientists have worked on the problem of getting computers to understand language. Automatic tokenization and automatic part-of-speech tagging now give satisfactory results, but computers still cannot reliably determine what language means, and this greatly hinders progress in natural language processing (NLP). Good Word Sense Disambiguation (WSD) would benefit every aspect of NLP, including Machine Translation (MT), Text Classification, and Information Retrieval.

We first explored the aspects of English words that most strongly influence WSD results. We then introduced upper and lower bounds for WSD in English-Chinese MT. Finally, we implemented a WSD system based on an existing algorithm and analyzed it to show what these studies can offer in practice.

The WSD task clearly calls for a thorough investigation of the lexicon of a language. Drawing on traditional dictionaries, WordNet, and earlier studies, we analyzed the open-class words (verbs, nouns, adjectives, and adverbs) and found that the most important part of English WSD is the most frequently used senses of the most frequently used verbs and nouns.

Because English-Chinese MT involves two languages, the upper and lower bounds defined for WSD in a single language are no longer suitable, and bounds specific to WSD in MT are needed. We computed sense frequencies from the WordNet Semantic Concordance and took the most likely sense of each word as the lower bound of English WSD. We then designed a questionnaire that takes the particularities of English-Chinese MT into account and estimated an upper bound on performance by measuring how often human judges agree with one another.

In practice, these two studies can guide the design of WSD systems for English-Chinese MT. We therefore conclude by describing our WSD test system, which is based on an existing algorithm, and analyzing its test results in light of the findings above.
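
The two bounds described above can be illustrated with a minimal sketch. It is not the thesis's actual procedure: the function names, the toy sense labels, and the toy judge answers are all hypothetical, and for brevity the baseline is scored on the same data it is counted from. The lower bound is taken as the accuracy of always choosing each word's most frequent sense, and the upper bound as the average pairwise agreement among human judges.

```python
from collections import Counter, defaultdict
from itertools import combinations


def most_frequent_sense(tagged):
    """Map each word to its most frequent sense in a list of (word, sense) pairs."""
    counts = defaultdict(Counter)
    for word, sense in tagged:
        counts[word][sense] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def baseline_accuracy(tagged, mfs):
    """Share of tokens whose tagged sense equals the word's most frequent sense."""
    hits = sum(1 for word, sense in tagged if mfs.get(word) == sense)
    return hits / len(tagged)


def pairwise_agreement(judgements):
    """Average share of items on which two judges give the same answer.

    judgements: one list of answers per judge, all of equal length.
    """
    pairs = list(combinations(judgements, 2))
    n_items = len(judgements[0])
    agree = sum(sum(a == b for a, b in zip(j1, j2)) for j1, j2 in pairs)
    return agree / (len(pairs) * n_items)


# Hypothetical sense-tagged tokens (stand-in for a semantic concordance).
tagged = [
    ("bank", "bank.n.1"),
    ("bank", "bank.n.1"),
    ("bank", "bank.n.2"),
    ("run", "run.v.1"),
]
mfs = most_frequent_sense(tagged)
lower_bound = baseline_accuracy(tagged, mfs)

# Hypothetical answers from three judges choosing a Chinese rendering
# (written in pinyin) for the same three English test items.
judges = [
    ["yinhang", "an", "pao"],
    ["yinhang", "an", "yunxing"],
    ["yinhang", "yinhang", "pao"],
]
upper_bound = pairwise_agreement(judges)

print(f"lower bound (most frequent sense): {lower_bound:.2f}")
print(f"upper bound (judge agreement):     {upper_bound:.2f}")
```

Raw pairwise agreement is the simplest possible measure; a chance-corrected statistic could be substituted without changing the overall structure.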
Keywords/Search Tags: Machine Translation, Word Sense Disambiguation, Upper and Lower Bounds