Building A Chinese Semantic Resource Based On Feature Structure

Posted on: 2012-12-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Chen
GTID: 1225330467968351
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
Semantic parsing of large-scale Chinese natural-language text has long been a difficult problem in natural language processing. Traditional annotation methods run into a dilemma when handling sentence patterns particular to Chinese, such as subject-predicate predicate sentences and serial verb sentences. Semantically annotated corpora lay a solid foundation for natural language processing and its applications. We propose a novel semantic parsing model, Feature Structure, for the analysis of Chinese sentences, and build a large-scale corpus with this method from 30,000 sentences drawn from news data. We further analyze subject-predicate predicate sentences and serial verb sentences, which have been a matter of controversy for a long time. Our research shows that Feature Structure captures more semantic relatedness than traditional dependency methods when used to analyze Chinese sentences. In addition, it can account, within linguistic theory, for the domain, type and characteristics of Chinese sentences. This research not only offers a novel semantic parsing method but also supplies the research community with semantic resources, and it gives insight into some problems in Chinese linguistics.

The dissertation comprises six chapters.

Chapter one is the introduction. It covers the background of the research, the state of the art at home and abroad, the definitions used, and an overview.

Chapter two presents the feature structure model, including the definition of feature structure and its characteristics. In general, a feature structure displays conceptual relatedness and the type of that relatedness. It is represented as a triple consisting of an entity, a feature and the feature's value. Triples allow multiple relatedness and cross-relatedness, that is, embedding and recursion. The formal representation of a set of triples is an undirected graph. A feature structure can be identified by interrogation, that is, by asking a question about the sentence. We analyze the conditions under which such questions can be asked, the components the questions target, and the distribution of feature words. Feature Structure triples fall into six categories.

Chapter three concerns the construction of Chinese sentence resources based on feature structure. The raw materials were taken from the Penn Treebank, from three years of Chinese newspaper articles, and from primary and secondary school texts. Annotation combined manual annotation with software support, and we developed a language annotation platform for the task. The focus of this chapter is the annotation criteria.

Chapter four is a case study of Chinese subject-predicate predicate sentences based on feature structure. The chapter reviews the main issues and controversies surrounding these sentences. We analyze the difficulties in annotating them and classify them into six types. We give a semantic description of these sentences based on feature structure. We also compare feature structure with traditional dependency parsing methods and find that feature structure displays more semantic information.

Chapter five is a case study of Chinese serial verb sentences based on feature structure. The chapter reviews the main issues and controversies surrounding these sentences. We analyze the difficulties in annotating them and classify them into four types. Comparing feature structure with traditional dependency methods on these sentences, we find that feature structure can represent the semantic relatedness between the subject and the second verb of a serial verb sentence, whereas traditional dependency methods cannot. We therefore conclude that feature structure contains more semantic information. In addition, feature structure can account for some linguistic phenomena that traditional dependency methods fail to explain. That is, traditional dependency methods can neither determine what counts as a serial verb sentence nor distinguish a serial verb sentence from a condensed complex sentence. In short, feature structure contributes to linguistic theory and provides a novel parsing method for Chinese language processing.

Chapter six concludes the dissertation. It covers the evaluation of feature structure, the research highlights, the value in application, and directions for further study.

The originality of this dissertation lies in three aspects:
1. The Feature Structure model is proposed to account for linguistic phenomena and controversies in Chinese.
2. A large-scale resource based on feature structure is built.
3. Feature Structure is applied to sentence patterns particular to Chinese, with satisfactory results.
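As a rough illustration of the triple representation described for chapter two, the following Python sketch stores (entity, feature, value) triples and builds an undirected graph from them. It is only a sketch under stated assumptions: the class name `FeatureTriple`, the helper `build_graph`, the English glosses and the feature labels `agent`, `destination` and `patient` are all hypothetical and are not taken from the dissertation's annotation scheme.

```python
from collections import defaultdict
from typing import NamedTuple


class FeatureTriple(NamedTuple):
    """One (entity, feature, value) triple; all labels here are hypothetical."""
    entity: str
    feature: str
    value: str


def build_graph(triples):
    """Build an undirected graph: each triple links entity and value both ways."""
    graph = defaultdict(set)
    for t in triples:
        graph[t.entity].add((t.feature, t.value))
        graph[t.value].add((t.feature, t.entity))
    return graph


# Hypothetical gloss of a serial verb sentence: "he go store buy book".
triples = [
    FeatureTriple("go", "agent", "he"),
    FeatureTriple("go", "destination", "store"),
    FeatureTriple("buy", "agent", "he"),   # subject also linked to the second verb
    FeatureTriple("buy", "patient", "book"),
]

graph = build_graph(triples)

# Every node the subject is related to, regardless of edge direction.
related_to_he = sorted(node for _, node in graph["he"])
print(related_to_he)
```

In this toy example the subject node is linked to both verbs, which is the kind of relatedness between the subject and the second verb that the abstract says feature structure can represent.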
Keywords: Feature Structure, semantic labeling, Feature Triple, semantic resource, semantic parsing