A Contrastive Study on the Part-of-Speech Tagging of the Dictionary of Contemporary Chinese and the Grammatical Knowledge-base Dictionary

Posted on: 2017-09-11
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhao
Full Text: PDF
GTID: 2335330512452989
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
Parts of speech have long been a key, difficult, and much-debated issue in the study of Chinese. After extensive discussion and research by linguists, the questions of whether Chinese has parts of speech, whether its words can be classified, and how they should be classified have been largely settled, and a consensus has been reached: Chinese has parts of speech, and words can be classified according to their grammatical function. Part-of-speech tagging is also a shared concern of Chinese information processing, modern Chinese grammar, and lexicology. Linguists have therefore proposed many part-of-speech tagging systems for Chinese. As a result, the classification of some words differs considerably from one system to another, and so far there is no unified, large-scale set of part-of-speech tagging results.

This thesis therefore takes as its object the part-of-speech tagging results of two large dictionaries, the Dictionary of Contemporary Chinese (fifth edition) and the Grammatical Knowledge-base Dictionary. Using a part-of-speech correspondence algorithm, it automatically identifies differences in how the two dictionaries tag the same word, analyzes the causes of those differences, and offers the author's own views on part-of-speech tagging.

The thesis is divided into five parts. The first part is the introduction. It reviews related work on part-of-speech classification in modern Chinese and the current state of research on the part-of-speech tagging of the two dictionaries and on their comparison, and it sets out the purpose, methods, and innovations of the study. The second part describes the source of the corpus and the data extraction. With the two dictionaries as the corpus, a program was designed to extract and compare the words that carry part-of-speech tags in both. The comparison yields four kinds of part-of-speech correspondence result, rendered in this translation as "idiom", "contour", and two kinds of equivalent-correspondence class. The thesis analyzes the latter two. The third part classifies and studies the words in the equivalent-correspondence class. This class is divided into four subclasses: covering correspondence, migration correspondence, standard correspondence, and pseudo-correspondence. The differences in part-of-speech classification within each subclass are compared and analyzed. The fourth part analyzes the remaining equivalence class, which consists mainly of words that, apart from any parts of speech they share, are tagged inconsistently in the two dictionaries. It has three sections: the first studies words whose part of speech is empty in the Dictionary of Contemporary Chinese, the second studies words whose part of speech is empty in the Grammatical Knowledge-base Dictionary, and the third briefly analyzes words that are tagged in both dictionaries but whose tags still do not match. The fifth part is the conclusion. It summarizes the findings, the words that the two dictionaries tag differently, and the reasons for the differences.

The thesis is innovative in two respects. First, it is a contrastive study of the parts of speech of words in the two dictionaries, with particular attention to tagging differences. Such studies are rare, and systematic ones rarer still. To date, the only related work is a paper in the proceedings of the 16th International Workshop on Chinese Lexical Semantics, "A preliminary comparison of the word classification results of two dictionaries", which makes a preliminary comparison, from the perspective of distinguishing differences, of words in the two dictionaries. Second, the thesis works at the micro level and analyzes the words concerned one by one. In practical word segmentation and part-of-speech tagging, what matters most is how the two dictionaries differ in tagging the same word, because such differences create extra work for manual tagging and proofreading. The thesis therefore examines, from a micro perspective, the words that are tagged differently in the two dictionaries, explains the causes of the differences, and offers the author's own views, in the hope of making a modest contribution to lexicography and Chinese information processing.
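The abstract does not give the details of the correspondence algorithm, so the following is only a minimal sketch of how the tags of shared headwords might be compared automatically. It assumes each dictionary has already been reduced to a mapping from headword to a set of tags in a common tag set. The function names, the relation labels, and the toy entries are all hypothetical and are not the thesis's own classes or data.

```python
from typing import Dict, Set


def classify_tags(tags_a: Set[str], tags_b: Set[str]) -> str:
    """Label the relation between two tag sets for one headword.

    The labels are illustrative only, not the thesis's four classes.
    """
    if not tags_a and not tags_b:
        return "empty_in_both"
    if not tags_a:
        return "empty_in_a"
    if not tags_b:
        return "empty_in_b"
    if tags_a == tags_b:
        return "identical"
    if tags_a < tags_b:  # proper subset: B has every tag of A plus more
        return "b_covers_a"
    if tags_b < tags_a:  # proper subset: A has every tag of B plus more
        return "a_covers_b"
    if tags_a & tags_b:  # some shared tags, some unshared on each side
        return "overlap"
    return "disjoint"


def compare_dictionaries(
    dict_a: Dict[str, Set[str]], dict_b: Dict[str, Set[str]]
) -> Dict[str, str]:
    """Classify every headword that appears in both dictionaries."""
    shared = sorted(dict_a.keys() & dict_b.keys())
    return {word: classify_tags(dict_a[word], dict_b[word]) for word in shared}


# Toy entries with placeholder headwords (not taken from either dictionary).
toy_a = {"w1": {"n"}, "w2": {"n", "v"}, "w3": set(), "w4": {"a"}}
toy_b = {"w1": {"n"}, "w2": {"v"}, "w3": {"d"}, "w4": {"v"}, "w5": {"n"}}

for word, relation in compare_dictionaries(toy_a, toy_b).items():
    print(word, relation)
```

Sets are used here because one headword may carry several parts of speech, and subset and intersection tests express "one dictionary covers the other" directly. Mapping two different tag sets onto a common one is a separate step that this sketch leaves out.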
Keywords/Search Tags: Dictionary of Contemporary Chinese, Grammatical Knowledge-Base Dictionary, Part-of-speech Annotation, Part-of-speech Correspondence