Comparative Study On Formalized Discourse Structure Theories

Posted on: 2011-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: X P Niu
GTID: 2155360308958310
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
With the arrival of the computer era, the corpus has become a powerful tool in linguistics, language teaching, natural language processing, and related fields, and has given rise to a new discipline: corpus linguistics. Corpus annotation is crucial to corpus linguistics and determines the quality of corpus construction. Discourse structure annotation is a significant stage that follows syntactic tagging. So far, some corpus projects in China and abroad have carried out syntactic tagging on a considerable scale, but discourse structure annotation is still uncommon. The reason is that its theoretical foundation, formalized discourse structure theory, is not yet mature. Since the 1980s, discourse structure has been a focus of linguistic research, and several discourse structure theories have been proposed in China and abroad, all of them formalized to a greater or lesser degree. From the perspective of corpus linguistics, and with discourse structure annotation as its goal, this study compares formalized discourse structure theories both in theory and in annotation practice, hoping to provide a reference for establishing a mature formalized discourse structure theory.

This thesis is composed of five chapters. Chapter One, the Introduction, presents the research motivation, the research questions, the significance of the study, and the outline of the thesis. Chapter Two, the Literature Review, defines the relevant terms and introduces the current state of four formalized discourse structure theories: Rhetorical Structure Theory (RST), Discourse Representation Theory (DRT), Segmented Discourse Representation Theory (SDRT), and the Information Dependency Model (IDM). Chapter Three introduces the analytical procedures and techniques of the four theories. RST and IDM represent discourse structure as trees or graphs and are supported by corresponding software. DRT and SDRT represent discourse structure with logical expressions and frames and are not supported by corresponding software. Chapter Four compares the four theories both in theory and in annotation practice. It examines them against the General Principles for Corpus Annotation and against discourse coherence. The conclusions drawn from the theoretical analysis are then checked against the results of annotating a specific discourse with each of the four theories. Chapter Five is the Conclusion.

The research results show that:
1) In terms of the separability and operability of annotation, RST and IDM perform well and are suitable for guiding discourse structure annotation, while DRT and SDRT perform poorly. Furthermore, IDM describes semantic relations more precisely than RST.
2) In terms of representing discourse coherence, SDRT and IDM represent discourse structure more thoroughly, covering both macrostructure and microstructure; RST focuses on macrostructure, and DRT pays more attention to microstructure.

Overall, IDM is the most suitable theoretical foundation for discourse structure annotation.
Keywords/Search Tags: discourse structure, formalization, corpus annotation