
Research On Chinese Maximal Noun Phrases Recognition

Posted on: 2008-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Wang
GTID: 2178360245998035
Subject: Computer Science and Technology
Abstract/Summary:
Noun phrases are an important component of a text, and recognizing them helps a reader or system grasp the main meaning the text expresses. Recognition of Chinese maximal noun phrases (MNPs) is helpful in the same way. Many natural language applications, such as information retrieval, text classification, automatic summarization, and anaphora resolution, would benefit from Chinese MNP recognition, so the study of MNPs is essential. This thesis concentrates on Chinese noun phrase recognition. MNP recognition is implemented with two methods, the Hidden Markov Model (HMM) and the Conditional Random Field (CRF), and the test results are analyzed.

First, this thesis identifies Chinese MNPs with HMMs, applying both the traditional HMM and a second-order HMM. Because the second-order HMM conditions each prediction on more of the preceding state history, its test results are better than those of the traditional HMM. Because of the drawbacks of the HMM itself, however, the overall HMM results are not good enough. To address these limitations, the thesis then adopts the CRF method to identify Chinese MNPs, which shows clear advantages over the HMM for this task. The HMM relies on strong independence assumptions, whereas the CRF allows arbitrary dependencies on the observation sequence; in addition, its features do not need to completely specify a state or an observation. The results show that the CRF-based recognition method performs considerably better.

Finally, MNP recognition is applied to a specific task-oriented anaphora resolution problem. A Chinese MNP includes the modifiers of its central noun phrase, which may specify information such as gender, color, number, date, and orientation. In this task, anaphora resolution needs the gender and number information about the central noun of the MNP. MNP recognition can therefore extract this useful information from the text and be applied to anaphora resolution.
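
As a rough illustration of the HMM-based recognition described above, the following minimal sketch treats MNP recognition as sequence labeling and decodes it with the Viterbi algorithm for a traditional (first-order) HMM. Everything specific in it is an assumption made for illustration and does not come from the thesis: the B/I/O tag set, the use of part-of-speech tags (`n`, `a`, `v`, `u`, `p`) as observations, the probability values, and the helper names `viterbi` and `spans`.

```python
import math

# Hypothetical tag set: B = begins an MNP, I = inside an MNP, O = outside.
STATES = ("B", "I", "O")

# Toy parameters chosen by hand for illustration only (not from the thesis).
START = {"B": 0.5, "I": 0.0, "O": 0.5}
TRANS = {
    "B": {"B": 0.1, "I": 0.6, "O": 0.3},
    "I": {"B": 0.1, "I": 0.5, "O": 0.4},
    "O": {"B": 0.4, "I": 0.0, "O": 0.6},  # an MNP cannot continue after O
}
# Observations are assumed to be part-of-speech tags.
EMIT = {
    "B": {"n": 0.40, "a": 0.30, "v": 0.10, "u": 0.05, "p": 0.15},
    "I": {"n": 0.50, "a": 0.15, "v": 0.10, "u": 0.20, "p": 0.05},
    "O": {"n": 0.10, "a": 0.10, "v": 0.50, "u": 0.05, "p": 0.25},
}


def log(p):
    """Log probability, mapping zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations):
    """Return the most probable B/I/O tag sequence under the toy HMM."""
    if not observations:
        return []
    # best[t][s]: log probability of the best path ending in state s at t.
    best = [{s: log(START[s]) + log(EMIT[s][observations[0]]) for s in STATES}]
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: best[-1][r] + log(TRANS[r][s]))
            scores[s] = best[-1][prev] + log(TRANS[prev][s]) + log(EMIT[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    path = [max(STATES, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def spans(tags):
    """Convert B/I/O tags into half-open (start, end) MNP spans."""
    found, start = [], None
    for i, tag in enumerate(tags):
        if tag != "I" and start is not None:
            found.append((start, i))
            start = None
        if tag == "B":
            start = i
    if start is not None:
        found.append((start, len(tags)))
    return found


if __name__ == "__main__":
    tags = viterbi(["a", "u", "n", "v", "n"])
    print(tags, spans(tags))
```

A second-order HMM would replace `TRANS[prev][s]` with a transition probability conditioned on the two preceding tags, and a CRF would replace the per-state emission table with weighted feature functions over the whole observation sequence; both changes are outside the scope of this sketch.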
Keywords/Search Tags: noun phrases, maximal noun phrases recognition, HMM, CRF