The Identification Of The Boundary Of The Prepositional Phrase "GeiX" In The Format "GeiX+V"

Posted on: 2009-01-09 | Degree: Master | Type: Thesis
Country: China | Candidate: H Y Zhou | Full Text: PDF
GTID: 2155360245467193 | Subject: Linguistics and Applied Linguistics
Abstract/Summary:
The construction of a Chinese corpus is a systematic project that can be divided into four levels: automatic word segmentation, part-of-speech tagging, syntactic analysis and tagging, and semantic and pragmatic analysis and tagging. In Chinese information processing, the most pressing task at present is "sentence treatment" (syntactic analysis and tagging, and semantic and pragmatic analysis and tagging). The prepositional phrase is a very important phrase structure in modern Chinese, and its automatic recognition matters for any further syntactic analysis. Prepositions share general properties but also show strong individual characteristics. We therefore select the preposition "gei" as a case study, aiming to contribute to the automatic recognition and syntactic analysis of prepositional phrases by automatically recognizing the prepositional phrase "gei X" in the "gei X+V" format.
Based on previous research, we make a detailed corpus-based investigation of the "gei X+V" format and related problems.
Chapter 1 mainly analyzes the elements that precede "gei". The sequence "V gei NP" is structurally ambiguous: it can be segmented as V/gei NP or as V gei/NP. To identify the boundary of the prepositional phrase "gei X", we must therefore first determine the boundary of the format itself. We distinguish the elements preceding "gei" in these different combinations and arrange them in tables.
Chapter 2 mainly analyzes the structures related to "gei X". Analysis of the corpus shows that once certain verbs combine with simple directional verbs and similar elements, they form a single unit whose combinatory capacity changes considerably. When such a unit combines with "gei X", the result is a more fixed structure. In some special forms the meaning of the preposition "gei" has been bleached to a certain degree; because these forms still fall within the scope of this thesis, we do not treat them separately.
Chapter 3 mainly analyzes the core verb V in the "gei X+V" format. Analysis and statistics show that "gei X" is, as a rule, directly adjacent to the verb; examples with a modifier between "gei X" and the verb account for 2.11%. The verbs in the format are mainly bivalent action verbs, followed by trivalent verbs, with monovalent verbs fewer still. We list in tables the verbs that cannot combine with "gei X".
Chapter 4 is the focus of this thesis. It gives a full description and analysis of the syntactic form of "X" in the "gei X+V" format. "X" is basically a nominal (substantive) component: 74.02% of the instances are single words and 25.98% are complex phrases. Some of the complex phrases contain verbs, and these have relatively obvious formal markers.
Chapter 5 mainly designs the recognition algorithm for the prepositional phrase "gei X", based on the ontological research and formal representation of the preceding chapters, and then presents a flow chart of the recognition process that makes the algorithm easier to visualize.
The study is based on a large-scale corpus and oriented to automatic recognition by computer, and it uses quantitative analysis, statistics and formal description. Through marking, analyzing and labeling the corpus, we obtain a number of corpus-based wordlists that contribute to recognizing the boundary of the prepositional phrase "gei X"; more precisely, we refine formal rules that provide a formal representation for computers. Some deficiencies remain in this thesis, and it needs further improvement.
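The abstract does not give the details of the recognition algorithm, so the following is only a minimal illustrative sketch of how wordlist-driven boundary recognition of this kind might look, not the thesis's actual procedure. It assumes the input is already segmented and part-of-speech tagged, and every name in it (`find_gei_x`, `VERBS_TAKING_GEI`, the tags "v" for verb and "d" for adverb, the sample wordlist entries) is hypothetical.

```python
from typing import List, Optional, Tuple

# A token is (word, POS tag). The tag set ("v" verb, "d" adverb, ...) is an assumption.
Token = Tuple[str, str]

# Hypothetical wordlist: verbs that form one unit with a following "给" (V gei/NP),
# so that "给" does not head a prepositional phrase after them.
VERBS_TAKING_GEI = {"送", "交", "寄"}


def find_gei_x(tokens: List[Token]) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices of the first "给 X" phrase, end exclusive.

    Simplifications (assumptions of this sketch): X ends before the first verb
    after "给", so complex X phrases that themselves contain verbs are not handled.
    """
    for i, (word, _) in enumerate(tokens):
        if word != "给":
            continue
        # Left context: skip the "V gei/NP" reading listed in the wordlist.
        if i > 0 and tokens[i - 1][0] in VERBS_TAKING_GEI:
            continue
        # Right context: look for the core verb V that follows "给 X".
        for j in range(i + 1, len(tokens)):
            if tokens[j][1] == "v":
                end = j
                # Modifiers (adverbs) between "给 X" and V are not part of X.
                while end > i + 1 and tokens[end - 1][1] == "d":
                    end -= 1
                if end > i + 1:
                    return (i, end)
                break  # X would be empty; try the next "给"
    return None
```

For example, on the tagged sentence 他/r 给/p 我/r 买/v 书/n the function returns the span covering "给 我", while on 他/r 送/v 给/v 我/r 书/n it returns `None` because "送" is in the hypothetical wordlist. A real system would replace the toy wordlist and the single right-boundary rule with the corpus-derived wordlists and formal rules the thesis describes.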
Keywords/Search Tags:the format of "gei X+V", the preposition "gei", the syntactic form, the algorithm, the boundary recognition