
Building A Chinese Dependency GraphBank

Posted on: 2016-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: C J Xing
Full Text: PDF
GTID: 2295330470984029
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
Natural language processing needs to recover the semantic relations between words from linear sentences. Tree-based syntactic structure has played a significant role in natural language processing because the semantic relations of a sentence can generally be deduced from it. However, as corpora have continued to expand in recent years, researchers have found that tree structures cannot fully describe syntactic structure, and that corpora contain a significant number of non-projective tree structures and graph structures. Meanwhile, semantic role labeling and computational experiments are also challenging the dominance of tree structures. A nominal constituent can often act as the argument of multiple predicates, which gives rise to graph structures in the argument structure of a sentence.

Researchers have proposed AMR (Abstract Meaning Representation), a graph-based method of sentence semantic representation, for the analysis of English. This paper attempts to establish a graph-based annotation scheme that integrates syntax and semantics for Chinese, and to build a Chinese Dependency GraphBank.

First, this paper summarizes syntactic theory and the development of methods for representing syntactic structure. We find that tree structures cannot represent all of the syntactic and semantic relations when we analyze the syntactic structure or the arguments of a sentence. Therefore, a graph-based annotation scheme that integrates syntax and semantics is needed for Chinese.

Second, on the basis of this theoretical preparation, this paper builds a new tag set step by step and lays down principles for annotation.

Third, we annotate part of the Chinese data from the CoNLL 2009 evaluation, 1,230 sentences in total, with an annotation toolkit. We encountered some problems while annotating these sentences.

Fourth, we carry out a statistical analysis of the annotation results. We find that 795 of the 1,230 sentences contain graph structures, a large proportion of 64.6%. We then analyze, one by one, the special linguistic phenomena that lead to graph structures.

This paper demonstrates that a certain number of Chinese sentences must be represented by graph structures, and that these sentences always lead to low precision in Chinese syntactic analysis. In addition, the annotation process shows that graph structures have obvious advantages over tree structures in representing the syntactic-semantic relations of a sentence.
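As a rough illustration of what "contains a graph structure" can mean in practice, the sketch below flags a sentence when some token has more than one head, and separately checks for crossing arcs (non-projectivity). This is only a minimal sketch under stated assumptions: the abstract does not give the thesis's exact criterion, the arc format `(head, dependent, label)` and the function names are hypothetical, and the relation labels are invented for the example rather than taken from the thesis's tag set. The toy sentence is a pivotal construction, 我 请 他 来 ("I invite him to come"), in which 他 is an argument of both 请 and 来.

```python
from collections import defaultdict

# An arc is (head_index, dependent_index, label); index 0 is an artificial root.
# The labels used below are hypothetical, not the thesis's tag set.

def has_graph_structure(arcs):
    """Assumed criterion: some token has more than one distinct head."""
    heads = defaultdict(set)
    for head, dependent, _label in arcs:
        heads[dependent].add(head)
    return any(len(h) > 1 for h in heads.values())

def has_crossing_arcs(arcs):
    """True if two arcs cross when drawn above the sentence (non-projective)."""
    spans = [tuple(sorted((head, dep))) for head, dep, _label in arcs]
    for left1, right1 in spans:
        for left2, right2 in spans:
            if left1 < left2 < right1 < right2:
                return True
    return False

def graph_proportion(sentences):
    """Percentage of sentences flagged as graph structures, to one decimal."""
    flagged = sum(1 for arcs in sentences if has_graph_structure(arcs))
    return round(100.0 * flagged / len(sentences), 1)

# Toy pivotal sentence: 1=我, 2=请, 3=他, 4=来.
pivotal = [
    (0, 2, "root"),
    (2, 1, "subj"),
    (2, 3, "obj"),   # 他 is the object of 请 ...
    (2, 4, "comp"),
    (4, 3, "subj"),  # ... and also the subject of 来
]

# Toy tree-shaped sentence: 1=我, 2=来.
simple = [(0, 2, "root"), (2, 1, "subj")]

print(has_graph_structure(pivotal))
print(has_graph_structure(simple))
print(graph_proportion([pivotal, simple]))
# The proportion reported in the abstract can be checked the same way:
print(round(100.0 * 795 / 1230, 1))
```

Keeping the two checks separate reflects the distinction drawn in the abstract between non-projective tree structures, where every token still has a single head, and graph structures, where that constraint is given up.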
Keywords/Search Tags: syntactic-semantic, Dependency Grammar, graph structures, annotation, GraphBank