Syntactic Tagging On Modern Chinese Special Sentence Patterns Based On Information Dependency Language Model

Posted on: 2013-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: L Wu
GTID: 2235330362473783
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
Corpora are an important tool for language study and have great practical value. The focus of corpus construction is improving annotation quality, and how to annotate a corpus in depth has become a key topic in corpus linguistics. At present, the annotation of most corpora is confined to the lexical level, and the scale of syntactic tagging is very limited. Traditional syntactic tagging theories such as Phrase Structure Grammar (PSG) and Dependency Grammar (DG) cannot be applied to large-scale corpus annotation because of their respective defects. The Information Dependency Language Model (IDLM), put forward by Li Liangyan (2009), is a syntactic tagging theory designed for corpus construction. It integrates the advantages of PSG and DG and offers greater descriptive capacity than either. It also borrows the principle of conceptual autonomy and conceptual dependency from Cognitive Grammar (CG) and values both the description and the explanation of language. Because it combines syntactic and semantic analysis, IDLM is a relatively new and promising annotation theory. Before it is used in large-scale corpus construction, IDLM should first be applied to the difficult points and widely discussed topics of a language in order to test its applicability. This thesis aims to create a set of annotation norms that lay a firm foundation for large-scale corpus annotation.

Modern Chinese contains a large number of special sentence patterns, which are a difficult area of language study. Traditional research examines these patterns from a purely linguistic point of view, and different linguists often hold very different opinions on the same problem. So far, very few studies have analyzed and formalized modern Chinese special sentence patterns from the viewpoint of corpus construction. Drawing on previous studies, the author chooses 4 typical kinds of special sentence patterns as the research object of this thesis and provides evidence along with formalized analysis results based on IDLM. The purpose of the research is, on the one hand, to verify the applicability of IDLM to modern Chinese and, on the other, to complete the annotation of these special sentence patterns. This thesis is a preliminary study of the application of IDLM in corpus construction.

The thesis consists of five chapters. Chapter one is an introduction that presents the general background of the study, including research contents, the arrangement of contents, methodology, research objectives and research significance. Chapter two is the literature review, which sorts out and summarizes research results in the fields of corpus annotation and linguistics. Chapter three is the theoretical foundation of the thesis. It contains a detailed introduction to the mechanism of IDLM and a complete walk-through of its application to general sentence analysis. Chapter four is the IDLM analysis of the 4 special sentence patterns. It lists the analysis expressions and graphs and gives further syntactic and semantic explanation of the annotation results. Chapter five is the conclusion.

The thesis makes several innovative contributions. First, based on a careful study of previous research results, it chooses IDLM as its theoretical foundation for the purpose of corpus construction, which sets it apart from both traditional language study and traditional corpus annotation. Second, the IDLM analysis of the special sentence patterns deepens the application of IDLM in language analysis. Third, in the course of the study, the author finds that the IDLM annotation results for the special sentence patterns are not concise enough, so a referent form is employed to shorten the expressions, which is an improvement to the theory.

The study finds that, in IDLM, the relations between tagmemes and within each tagmeme are presented as multiple structures, which connect with each other by joining and nesting. IDLM analysis of language is consistent with human cognitive logic and behavioral experience, which supports natural language learning and language information extraction by computers. The emergence of IDLM breaks the impasse in corpus annotation and provides a new way for the field to develop. It is of great practical significance.

In conclusion, IDLM formalizes syntactic structure with a series of concise information dependency expressions and clear information dependency graphs. The analysis results match the original meaning of the sentences. IDLM shows strong capacity in language description and prepares the ground for the construction of large-scale corpora. It is a useful supplement to traditional language studies.
Keywords/Search Tags: IDLM, modern Chinese, special sentence patterns, syntactic tagging, corpus construction