Analysis And Marking Research On Multi-Verb Chinese Concept Compound Block

Posted on: 2016-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Wu
GTID: 2278330503960866
Subject: Computer application technology
Abstract/Summary:
As science and technology advance, data plays an increasingly important role in daily life. Syntactic analysis, a focus of natural language processing research, is receiving growing attention from scholars. Parsing is an important but difficult area of natural language processing. Full syntactic analysis of Chinese sentences is especially hard: the available methods are still in their infancy and are difficult to apply in practice. To reduce the difficulty of full parsing while still providing data support for current research, researchers have favored chunking. Chunk analysis applies a "divide and conquer" strategy that breaks a complex problem into modules. Its main concerns are the granularity of chunk division and the representation of structure within and between chunks. The Concept Compound Chunk description system was therefore proposed to describe both the basic structure of a sentence and the internal structure of each chunk. Current parsers are mostly designed for ordinary sentences and generally perform well on simple ones, but their results are unsatisfactory on complex sentences, such as those containing multiple verbs. The reason is that the parser locates verbs inaccurately, which causes errors in constituent division, and few researchers in China have designed parsers specifically for sentences containing multiple verbs. To address these problems, this thesis carries out work in the following aspects.
First, the thesis describes the Concept Compound Chunk at the conceptual level, then reviews the current needs of Chinese sentence analysis, and then describes how multi-verb content is represented in the Concept Compound Chunk.
The thesis also analyzes the Concept Compound Chunk annotation standard and proposes a method for normalizing the syntactically annotated treebank. The accuracy of the treebank directly affects the later training of models and the extraction of rules and data. The thesis therefore observes and statistically analyzes the manually annotated treebank and designs a normalization method. Using this method, some errors in the treebank are corrected or removed, which improves the treebank's reliability and prepares for the construction of training data.
Second, the thesis analyzes the types of errors that occur in multi-verb sentences during automatic Concept Compound Chunk parsing and proposes a method for hierarchical classification of verbs. Working on the normalized treebank, it examines the characteristics of sentences containing more than one verb and the types of annotation errors in them. The analysis shows that such sentences usually contain errors caused by inaccurate sentence analysis. Using statistical methods, the thesis first extracts the sentences containing more than one verb, then analyzes the rules by which the verbs combine, and designs an analysis method: classify the verbs in a sentence into levels and use the result as input to further analysis. Experiments show that this method benefits the subsequent parser.
Finally, the thesis presents a method for automatic analysis of multi-verb sentences. Sentences containing multiple verbs are analyzed, and those that match the rules go through verb classification. Once the verb hierarchy is obtained, a “Shift-reduce” chunking method analyzes the whole sentence, with a reduction check added to the original “Shift-reduce” end condition: if verbs that belong to the same group have not been reduced into the same chunk, reduction continues until they are.
Verbs that do not belong to the same group but have been reduced into the same chunk are left as they are. Chunks produced after the added reduction step may contain parts whose relation labels cannot be determined; these are handled with a label prediction method, which yields a complete analysis of the sentence. Experimental results show that, on sentences containing multiple verbs, this method gives better results than a general parser and handles verb constituents more accurately, thereby improving the parser's overall performance on complex sentences.
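The reduction check added to the "Shift-reduce" end condition can be illustrated with a minimal sketch. This is only one possible reading of the rule stated in the abstract, not the thesis's actual implementation. The function name `apply_group_reduction`, the representation of chunks as lists of token positions, the `verb_group` mapping from verb position to group id, and the use of merging adjacent chunks as a stand-in for further reduce actions are all assumptions made for illustration.

```python
from typing import Dict, List

Chunk = List[int]  # token positions covered by one chunk, in sentence order


def apply_group_reduction(chunks: List[Chunk],
                          verb_group: Dict[int, int]) -> List[Chunk]:
    """Hypothetical end-condition check for a shift-reduce chunker.

    If verbs of the same group sit in different chunks, the chunks between
    them (inclusive) are merged. Verbs of different groups that already
    share a chunk are left untouched.
    """
    chunks = [list(c) for c in chunks]  # work on a copy
    changed = True
    while changed:
        changed = False
        # For each verb group, record which chunks contain its verbs.
        group_chunks: Dict[int, List[int]] = {}
        for ci, chunk in enumerate(chunks):
            for tok in chunk:
                if tok in verb_group:
                    group_chunks.setdefault(verb_group[tok], []).append(ci)
        for cis in group_chunks.values():
            lo, hi = min(cis), max(cis)
            if lo != hi:
                # Same-group verbs are split: reduce the span into one chunk.
                merged = [t for c in chunks[lo:hi + 1] for t in c]
                chunks[lo:hi + 1] = [merged]
                changed = True
                break  # chunk indices shifted, so recompute the mapping
    return chunks


# Verbs at positions 1 and 3 belong to group 0; the verb at 5 to group 1.
print(apply_group_reduction([[0], [1, 2], [3], [4, 5]], {1: 0, 3: 0, 5: 1}))
# prints [[0], [1, 2, 3], [4, 5]]
```

Each merge removes at least one chunk, so the loop always terminates. How the verb groups themselves are derived (the hierarchical classification step) is not specified in the abstract and is taken as given input here.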
Keywords/Search Tags: syntactic treebank, treebank annotation normalization, verb hierarchical classification, "Shift-reduce" analysis, label prediction