An Ambiguous-Word Processing Mechanism Combining Rules and Statistics

Posted on: 2003-10-13
Degree: Master
Type: Thesis
Country: China
Candidate: L J Zhang
Full Text: PDF
GTID: 2208360065456066
Subject: Computer application technology
Abstract/Summary:
Part-of-speech tagging is a fundamental task in natural language processing. It is important for the tagging of Chinese corpora, for machine translation, and for information indexing of large-scale text. In this thesis, we study part-of-speech tagging methods and analyze both the rule-based approach and the statistical approach. On this basis, we propose a disambiguation strategy that combines rule techniques with statistical techniques. In the rule model, the method of acquiring the rule base is improved: the part of speech of a syntactic category is used in place of the syntactic category itself, and statistical methods are used to help construct the rule base. In the statistical model, the concept of a learning mechanism is introduced, and the calculation of transition probabilities and symbol probabilities is adjusted according to the results of learning. A disambiguation system is implemented using these methods. Its overall accuracy is 97.85% in the closed test and 96.71% in the open test. The experimental results show that combining rule techniques with statistical techniques raises both tagging accuracy and disambiguation accuracy.
Keywords/Search Tags: Part-of-speech tagging, N-gram, rule, learning mechanism, syntactic category
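
To make the combination of rules and statistics described in the abstract more concrete, the following is a minimal sketch, not the thesis's actual system. It assumes a bigram (N-gram with N = 2) model with add-one smoothing, decoded with the Viterbi algorithm, and one hand-written rule that prunes candidate tags before the statistical step. The tag set, the counts, the rule, and all function names are hypothetical and chosen only for illustration; the thesis's learning mechanism and its amended probability formulas are not reproduced here.

```python
import math

# Hypothetical toy counts for illustration only; not data from the thesis.
TAGS = ["PRON", "VERB", "NOUN"]
TRANSITIONS = {  # count of (previous tag, tag) pairs
    ("<s>", "PRON"): 8, ("<s>", "NOUN"): 2,
    ("PRON", "VERB"): 9, ("PRON", "NOUN"): 1,
    ("VERB", "NOUN"): 6, ("VERB", "VERB"): 1,
    ("NOUN", "VERB"): 5, ("NOUN", "NOUN"): 3,
}
EMISSIONS = {  # count of (tag, word) pairs, i.e. symbol counts
    ("PRON", "they"): 10,
    ("VERB", "plan"): 4, ("NOUN", "plan"): 6,
    ("NOUN", "trips"): 7, ("VERB", "trips"): 2,
}
VOCAB = {word for _, word in EMISSIONS}


def transition_logp(prev, tag):
    """Add-one smoothed log P(tag | prev)."""
    total = sum(c for (p, _), c in TRANSITIONS.items() if p == prev)
    return math.log((TRANSITIONS.get((prev, tag), 0) + 1) / (total + len(TAGS)))


def emission_logp(tag, word):
    """Add-one smoothed log P(word | tag), the symbol probability."""
    total = sum(c for (t, _), c in EMISSIONS.items() if t == tag)
    return math.log((EMISSIONS.get((tag, word), 0) + 1) / (total + len(VOCAB)))


def candidates(word):
    """Tags seen with this word; unknown words keep every tag."""
    seen = [t for t in TAGS if (t, word) in EMISSIONS]
    return seen or list(TAGS)


def apply_rules(words, i, cands):
    """Rule step (hypothetical rule): right after an unambiguous pronoun,
    drop the NOUN reading of an ambiguous word."""
    if i > 0 and len(cands) > 1 and candidates(words[i - 1]) == ["PRON"]:
        pruned = [t for t in cands if t != "NOUN"]
        return pruned or cands
    return cands


def tag(words):
    """Rules prune the candidate lattice, then Viterbi picks the best path."""
    lattice = [apply_rules(words, i, candidates(w)) for i, w in enumerate(words)]
    best = {"<s>": (0.0, [])}  # tag -> (log probability, best path ending in tag)
    for word, cands in zip(words, lattice):
        step = {}
        for t in cands:
            score, path = max(
                ((s + transition_logp(p, t), pth) for p, (s, pth) in best.items()),
                key=lambda item: item[0],
            )
            step[t] = (score + emission_logp(t, word), path + [t])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


print(tag(["they", "plan", "trips"]))
```

In this sketch the rule removes the NOUN reading of "plan" because it follows an unambiguous pronoun, and the bigram model then resolves "trips" statistically. Running the rule step first shrinks the search space the statistical model has to score, which is one plausible way to order the two techniques; the abstract does not state which ordering the thesis uses.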