
The Study On Chinese Text Segmentation

Posted on: 2004-06-18
Degree: Master
Type: Thesis
Country: China
Candidate: L J Xu
Full Text: PDF
GTID: 2155360092493724
Subject: Chinese Philology
Abstract/Summary:
Chinese text segmentation is a central problem in Chinese information processing, and how well it is solved will directly affect the prospects of the field. The most widely used approach today is automatic segmentation, but it cannot solve the problem completely, because it cannot resolve ambiguous segments. This raises three questions: how many ambiguous segments are there, how many kinds are there, and what causes them? To obtain a comprehensive view of these questions, we carry out a quantitative analysis and study the forms of ambiguous segments and the reasons they arise.

This paper comprises six parts:

Part One: Background and problem of the study. Chinese text segmentation plays an important role in Chinese information processing, and Chinese information processing cannot proceed without solving it.

Part Two: Question and present state of the study. Computers are now used to segment Chinese text, and some results have been achieved. However, automatic segmentation has an important limitation: it cannot resolve ambiguous segments.

Part Three: Content and method of our study. To obtain a comprehensive view of ambiguous segments, we use a controlled-language method. We take the 3,755 level-1 characters of the "Code of Chinese Graphic Character Set for Information Interchange - Primary Set" (GB 2312) as the research object, use the "Table of Common Modern Chinese Words for Information Processing" as the reference, and apply modern grammar to obtain statistics on ambiguous segments together with their forms and causes.

Part Four: Analysis of ambiguous segments in automatic segmentation. In this part, we use a computer program to obtain data on the current state of the words and on the ambiguous segments, and we study the forms, grammatical relations, and causes of the ambiguous segments.

Part Seven: Outlook on methods for solving Chinese text segmentation. From the study above, we conclude that automatic segmentation cannot resolve ambiguous segments completely. To solve the problem thoroughly, research on computer technology could approach it from the writing side: writing Chinese text word by word, with explicit word boundaries, or writing it according to the Scheme for the Chinese Phonetic Alphabet.
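
The abstract does not say which automatic segmentation algorithm was used. As a rough illustration of what an ambiguous segment is, the sketch below uses forward and backward maximum matching, a common dictionary-based technique that is swapped in here as an assumption and is not the thesis's own program. The lexicon `LEXICON`, the sentence `TEXT`, and both function names are hypothetical and chosen only for the example.

```python
# Hypothetical toy lexicon and sentence, chosen only to show an ambiguity.
LEXICON = {"研究", "研究生", "生命", "的", "起源"}
MAX_LEN = max(len(word) for word in LEXICON)
TEXT = "研究生命的起源"


def forward_max_match(text, lexicon, max_len):
    """Scan left to right, always taking the longest lexicon word available."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is accepted as a fallback even if unlisted.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, lexicon, max_len):
    """Scan right to left, always taking the longest lexicon word available."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                words.append(piece)
                j -= size
                break
    return words[::-1]


forward = forward_max_match(TEXT, LEXICON, MAX_LEN)
backward = backward_max_match(TEXT, LEXICON, MAX_LEN)
print(forward)   # ['研究生', '命', '的', '起源']
print(backward)  # ['研究', '生命', '的', '起源']
```

The two scans disagree on the overlapping span 研究生命: both 研究生 and 生命 are lexicon words, and a dictionary alone cannot decide between them. A disagreement of this kind is one simple way to flag a candidate ambiguous segment for further analysis.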
Keywords/Search Tags: Chinese information processing, Chinese text segmentation, automatic segmentation, ambiguous segment