
The Annotation Of Empty Categories In The Chinese Treebank

Posted on: 2013-09-26 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Zhang | Full Text: PDF
GTID: 2235330371490917 | Subject: English Language and Literature
Abstract/Summary:
This paper first creates a multi-representational and multi-layered Chinese treebank. “Multi-layered” means that the treebank has different layers of representation: we represent both syntax and lexical predicate-argument structure. “Multi-representational” means that we use both dependency and phrase structure for syntactic representation. The whole Chinese treebank has three main sublayers: phrase structure, predicate-argument structure (PropBank), and dependency structure. Then, on the basis of this multi-layered and multi-representational Chinese treebank, the paper adds an annotation of empty categories (ECs), an important issue in treebank design that is often neglected by computational linguists.

First, we give a detailed introduction to existing treebanks and their annotation schemas. Although ECs are a common representational device in treebanking, they are often not specifically motivated, so current annotation schemas seldom take them into consideration. After introducing the treebanks, we review in detail the theories of Chinese ECs by Noam Chomsky, James Huang, Xu Liejiong, and Shen Yang. These theories provide the basic motivation for adding ECs to a treebank annotation schema.

Second, this paper carefully explains the theoretical basis for the automatic annotation of ECs and their processing. The Empty Category Principle (ECP) proposed by Noam Chomsky shows the features of different types of ECs and the constraints on them. It serves as a check on our choice of EC types and on our decisions about whether or not to coindex them with other elements in the representation. It also gives the reasons why certain types of ECs are introduced at different stages of the annotation process.

Third, we confirm the use of different types of ECs at all three levels of representation. Different grammatical theories, however, classify EC types according to various criteria, so we make a clear distinction between two types of ECs, trace and silent, on the basis of whether or not they are postulated to mark displacement. Each type is further refined into several subtypes based on the underlying linguistic phenomena. After methodically investigating the different types of ECs and their role in our syntactic and semantic representations, we present a table of ECs in the Chinese treebank that summarizes all types of ECs used in our project. Chapter Four then presents our use of ECs in this project, showing how they are annotated and how they work at the three layers.
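
The two-way distinction and the three layers described above could be modelled roughly as follows. This is only an illustrative sketch, not the thesis's actual annotation format: the class and function names (`ECKind`, `Layer`, `EmptyCategory`, `summarize`) and the subtype labels used with them are hypothetical, since the abstract does not name the subtypes or the file format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ECKind(Enum):
    """Top-level EC types named in the abstract."""
    TRACE = "trace"    # postulated to mark displacement
    SILENT = "silent"  # not postulated to mark displacement


class Layer(Enum):
    """The three sublayers of the treebank named in the abstract."""
    PHRASE_STRUCTURE = "phrase structure"
    PREDICATE_ARGUMENT = "predicate-argument structure"
    DEPENDENCY = "dependency structure"


@dataclass(frozen=True)
class EmptyCategory:
    kind: ECKind
    subtype: str                   # hypothetical label; real subtypes are in the thesis
    layers: frozenset              # set of Layer values where this EC is annotated
    coindex: Optional[int] = None  # index shared with another element, if any


def marks_displacement(ec: EmptyCategory) -> bool:
    """The abstract's criterion: traces mark displacement, silent ECs do not."""
    return ec.kind is ECKind.TRACE


def summarize(ecs) -> dict:
    """Group subtype labels by top-level kind, like a summary table of ECs."""
    table = {kind.value: [] for kind in ECKind}
    for ec in ecs:
        if ec.subtype not in table[ec.kind.value]:
            table[ec.kind.value].append(ec.subtype)
    return {kind: sorted(labels) for kind, labels in table.items()}
```

Keeping `coindex` optional reflects the abstract's point that whether an EC is coindexed with another element is a separate decision from its type.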
Keywords/Search Tags: Chinese treebank, empty categories, annotation schema