
Research on the Identification of Derived Words Formed with High-Frequency Suffixes in Modern Chinese

Posted on: 2011-11-12
Degree: Master
Type: Thesis
Country: China
Candidate: F Wu
Full Text: PDF
GTID: 2155360302992051
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
Automatic word segmentation is a basic task in Chinese information processing, and the handling of unknown words is one of the main factors affecting its accuracy. Derived words account for the majority of new words among unknown words, so identifying them is important both for improving segmentation accuracy and for further automated analysis of Chinese syntax. For the purposes of information processing, this thesis takes the suffixes "zi", "tou" and "zhe" as its objects of study, based on a large-scale text corpus. It first discusses the internal word-formation rules and boundary features of derived words, then designs an identification algorithm that uses matching information and co-occurrence frequency. The thesis is organized as follows:
Introduction: presents the significance of the study, defines its objects, and summarizes current research on the topic, then describes the research methods and the source of the corpus used.
Chapter one: introduces the notion and features of Chinese word segmentation and ambiguity, then explains the overall pattern of lexical analysis and the formalized representation of rules.
Chapter two and Chapter three: discuss, from the internal structure of derived words, the syllable and part-of-speech restrictions that apply when the suffixes "zi" and "tou" form new words, and the boundary features of derived words containing "zi" and "tou".
Chapter four: discusses, from the internal structure of derived words, the syllable and part-of-speech restrictions that apply when the suffix "zhe" forms new words, and analyzes the relevance factors of "zhe".
Chapter five: building on the findings of the previous chapters, establishes the word list and rule base, proposes the general ideas, steps and methods for identifying derived words, then designs the algorithm and analyzes the difficult problems.
Chapter six: summarizes the conclusions of the study, points out its shortcomings and the problems that remain to be solved, and finally offers an outlook for future research.
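The abstract does not give the algorithm itself, so the following is only a minimal sketch of how suffix matching might be combined with co-occurrence frequency. Everything in it is an assumption made for illustration: the function names, the hypothetical stem list, the maximum stem length, and the frequency threshold do not come from the thesis, and the thesis's rule base (syllable and part-of-speech restrictions, boundary features) is not modeled.

```python
from collections import Counter

# The three suffixes studied in the thesis: "zi", "tou", "zhe".
SUFFIXES = ("子", "头", "者")


def candidate_derivations(text, max_stem_len=2):
    """Yield (stem, suffix) pairs for every suffix character in text.

    max_stem_len is an illustrative assumption, not a value from the thesis.
    """
    for i, ch in enumerate(text):
        if ch in SUFFIXES:
            for n in range(1, max_stem_len + 1):
                if i - n >= 0:
                    yield text[i - n:i], ch


def identify(corpus, stem_list, min_count=2):
    """Keep stem+suffix strings whose stem is listed and that recur in the corpus.

    stem_list stands in for the thesis's word list; min_count is a
    hypothetical co-occurrence threshold.
    """
    counts = Counter()
    for sentence in corpus:
        for stem, suffix in candidate_derivations(sentence):
            if stem in stem_list:  # matching step
                counts[stem + suffix] += 1  # co-occurrence count
    return {word: c for word, c in counts.items() if c >= min_count}


# Toy corpus and stem list, invented for this example.
corpus = ["读者喜欢这本书", "作者感谢读者", "桌子上有书"]
stems = {"读", "作", "桌"}
result = identify(corpus, stems)
print(result)  # {'读者': 2}
```

In this toy run only "读者" passes the threshold, because "作者" and "桌子" each occur once. A fuller system along the lines the abstract describes would presumably also check the candidate's left and right context before accepting it.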
Keywords/Search Tags: suffix, "zi", "tou", "zhe", derivations, identification