Research on the Constraint Rules of Syntactic Relations in Cross-Punctuation Sentences in Written Modern Chinese

Posted on: 2008-05-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R P Zhang
Full Text: PDF
GTID: 1115360215981081
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
At present, Chinese syntactic analysis mostly targets the single sentence. However, the boundaries of a Chinese single sentence are very difficult to determine automatically in a real corpus. The main formal marker available is punctuation, and formalization is a prerequisite for Chinese language processing, so the punctuation sentence becomes the basic unit with which a computer processes Chinese sentences automatically. The boundaries of a punctuation sentence are clear, but many punctuation sentences are syntactically incomplete, and their missing elements have to be found in the context. The syntactic analysis of relations between punctuation sentences has not yet been studied systematically. As a result, the parsing and generation of long Chinese sentences give poor results, and this has become the greatest difficulty in machine translation between Chinese and foreign languages and in the deep understanding of Chinese in language processing. To solve this problem, we must first investigate the syntactic relations between Chinese punctuation sentences carefully and summarize rules and constraints.

This work is based on the theoretical framework of the punctuation sentence. Its main purpose is to identify the components shared between punctuation sentences. So that computers can process punctuation sentences conveniently, we need to find formal constraint rules on the syntactic relations in addition to the stack-type rules. The work has two parts.

(1) Annotating the corpus and compiling surveys and statistics. We annotated the whole of Qian Zhongshu's "WeiCheng", 226,641 words and 24,115 punctuation sentences. The tags cover the syntactic relations between punctuation sentences, the shared components, and the shallow syntactic structure within each punctuation sentence, and we obtained statistics on each kind of punctuation sentence in the annotated corpus.
We also used text retrieval tools to carry out specialized investigations and statistics on modern and contemporary Chinese novels totaling tens of millions of characters.

(2) Mining the constraints. On the basis of the annotated corpus and the specialized investigations, we summarized constraints on punctuation sentences covering about a hundred major and minor aspects. We focus on punctuation sentences in which the yuanpei-sentence and the xupei-sentence are homologous and ordinal. The contents include the following.

Whether a punctuation sentence that begins with a noun or pronoun lacks a subject.

If the yuanpei-sentence has subject-verb-object structure, the subject of the xupei-sentence is either the subject or the object of the yuanpei-sentence. We discuss punctuation sentences whose yuanpei-sentence predicate is a sensory verb, "有", a verb taking a sentence as its object, a two-verb structure, "像", "V着" or "V完", as well as the effect of connectives, adverbs, adjectives and nouns on the shared components.

How to identify the adverbial modifier of the xupei-sentence, covering the various forms of adverbial.
We discuss the scope of negative words in punctuation sentences in a separate chapter.

How to identify the attributive of the xupei-sentence, covering quantifiers, adjectives, pronouns, nouns and noun phrases.

If the yuanpei-sentence is a 把-sentence or a 被-sentence, how to identify the shared components.

How to determine whether the whole or only part of a noun phrase joined by "跟" in the yuanpei-sentence is shared by the xupei-sentence.

If the yuanpei-sentence is a jianyu-sentence, how to identify the shared components.

This work has the following characteristics.

(1) Scope of study. In addition to the subject-predicate punctuation sentences covered by previous studies, we also studied attribute-head, adverbial-head, predicate-object, predicate-complement and preposition-object punctuation sentences, extending the research across the whole syntactic system of the punctuation sentence.

(2) Research perspective. We focus on the formal features of the constraints, so the results are easy to put into operation and lay a foundation for automatic computer processing.

(3) Research methods. Besides examples, we not only look for cognitive explanations of the language phenomena through the traditional method of introspection, but also compile statistics on language phenomena in a real corpus and treat the statistical data as corroboration of the reliability of the rules.

The major innovation of this paper is its in-depth mining of language features from many perspectives.
The main features are as follows.

If the yuanpei-sentence has "subject-verb-object" structure and the xupei-sentence lacks a subject, how do we determine whether the xupei-sentence uses the subject or the object of the yuanpei-sentence? The paper points out several important distinguishing features.

One of the main indicators for distinguishing a subject topic from an object topic is whether the sentence is static or dynamic. The paper formally defines both kinds of punctuation sentence and points out how each relates to the subject topic and the object topic.

According to which nouns a verb affects, verbs are divided into those that affect only the agent noun and those that also affect the patient noun, in order to decide whether the subject switches.

The paper puts forward the concept of informativity and points out that if the yuanpei-sentence is a "有" sentence and/or the predicate of the xupei-sentence is a middle-state adjective phrase, determining the subject of the xupei-sentence depends on the informativity of the object in the yuanpei-sentence. The lower the informativity of the object, the more likely the object is the subject of the xupei-sentence.
We divide punctuation sentences into independent and dependent punctuation sentences, to judge whether two punctuation sentences are related to each other.

We divide nouns as a whole into independent and non-independent nouns, to judge whether a punctuation sentence is complete.

For punctuation sentences whose predicate contains two verbs in a main-subordinate relation, the paper uses sentence transformation to reduce them to single-predicate subject-verb-object sentences and then determines the subject of the xupei-sentence.

We divide verb and adjective predicates as a whole into directional and non-directional predicates, to settle whether a coordinate noun phrase is shared as a whole or only in part.

We divide adverbial modifiers into sentence adverbials and lexical adverbials, to judge whether an adverbial modifier is shared.

The above concepts and classifications are introduced for the first time in this paper.

We classify many words of each part of speech in detail by their semantics, to help determine the shared components across punctuation sentences. Many of these classes have appeared in the linguistics literature, but the methods used to define them and the purposes differ, and some are put forward here for the first time. This paper uses these word classes in combination, redefines some of them, and gives word lists of high-frequency words.
These include:

Verb classes: existential-presentative verbs, pre-existential-presentative verbs, sensory verbs, cognitive verbs, mental verbs, motion verbs, command verbs, body-motion verbs.
Noun classes: organ nouns, attribute nouns, family nouns, mental nouns.
Adjective classes: dynamic adjectives, static adjectives, middle adjectives.
Adverb classes: momentary-motion adverbs, mental adverbs, modal adverbs, time adverbs, conjunctive adverbs, scope adverbs, extent adverbs and so on.

The paper also puts forward the concept of mental words, comprising mental nouns, mental verbs, mental adjectives and mental adverbs.

The word classes put forward for the first time in this paper are: organ nouns, middle adjectives, momentary adverbs, mental adverbs, mental nouns and mental words.

The word classes that appear in the linguistics literature but are defined differently and with a different scope here are: pre-existential-presentative verbs, body-motion verbs, dynamic adjectives and static adjectives.

We also use parallel structures, among other devices, to help settle the question.

This work is very preliminary in the field of syntactic relations between punctuation sentences. Owing to time constraints, many issues are not addressed, and many problems have received only a first step. The results are still rather disorganized and not very systematic, and they do not yet cover algorithms or programs. These will be taken up gradually in future work.
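Since the abstract states that algorithms and programs are not yet covered, the following is only a minimal illustrative sketch of two notions it relies on, not the author's method. It assumes that a punctuation sentence is the text between two punctuation marks and that the yuanpei-sentence is the earlier punctuation sentence supplying a shared component to a later xupei-sentence. The delimiter set, the names `split_punctuation_sentences`, `Clause` and `candidate_subjects`, the hand-supplied shallow analysis and the example sentence are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed delimiter set: the abstract does not list which marks end a
# punctuation sentence, so this choice is illustrative only.
DELIMITERS = ",;:!?,;:。!?"


def split_punctuation_sentences(text: str) -> list[str]:
    """Split text at runs of delimiter marks and drop empty segments."""
    parts = re.split(f"[{re.escape(DELIMITERS)}]+", text)
    return [p.strip() for p in parts if p.strip()]


@dataclass
class Clause:
    """Hypothetical shallow analysis of one punctuation sentence."""
    text: str
    verb: str
    subject: Optional[str] = None
    obj: Optional[str] = None


def candidate_subjects(yuanpei: Clause, xupei: Clause) -> list[str]:
    """List candidate subjects for the xupei-sentence.

    If the xupei-sentence has its own subject, return it alone. Otherwise
    return the yuanpei-sentence's subject and object, in that order.
    Choosing between the two candidates is not attempted here.
    """
    if xupei.subject is not None:
        return [xupei.subject]
    return [c for c in (yuanpei.subject, yuanpei.obj) if c is not None]


# Invented example sentence; the analysis is supplied by hand, not parsed.
segments = split_punctuation_sentences("他看见一个孩子,正在哭。")
print(segments)  # ['他看见一个孩子', '正在哭']

yuanpei = Clause(text=segments[0], verb="看见", subject="他", obj="一个孩子")
xupei = Clause(text=segments[1], verb="哭")
print(candidate_subjects(yuanpei, xupei))  # ['他', '一个孩子']
```

The sketch stops at listing the two candidates. Deciding between them is where the constraints summarized in the abstract (static versus dynamic sentences, verb classes, informativity of the object) would apply, and none of that is modeled here.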
Keywords/Search Tags: Punctuation Sentences, Common Components, Syntax Relation, Constraint Situations