
The Research And Implementation Of The Tibetan Textual Automatic Classification Based On The Web

Posted on: 2013-12-02
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Z Ye
GTID: 2235330374499669
Subject: Chinese Ethnic Language and Literature
Abstract/Summary:
With the continued development of Internet technology and the steady improvement of computer applications in Tibetan areas, more and more Tibetan web pages can be found on the Internet. Classifying these pages by content across different Internet domains, so as to provide efficient and accurate information services, requires a large amount of human labor, which makes the traditional classification methods impractical. The traditional methods also achieve neither dimensionality reduction nor the elimination of redundant information, and their accuracy on web classification is rather low. If the computer can instead classify Tibetan web pages automatically, this not only reduces the investment of human and financial resources but also improves the efficiency and accuracy of web classification, so that web users can readily find the exact page they need.

The study of automatic classification of Tibetan web pages has broad application prospects and considerable practical significance for Tibetan search engines, Tibetan digital libraries, the development of Tibetan corpora, Tibetan publishing, and so on. This topic studies automatic classification of Tibetan text with the aim of improving the efficiency of information retrieval. Web classification has become a key technology of great practical value: it is a common method for handling massive amounts of web information and an important research topic in data mining and text mining.

Automatic classification of Tibetan web pages comprises preprocessing of the pages, the classification of Tibetan words, characteristic (feature) selection, weight calculation, and the classification calculation method, among other steps. This paper mainly studies the automatic classification of Tibetan web pages, especially the classification of Tibetan words, feature selection, and the classification calculation method. In addition, classification calculation methods used for Chinese and English are taken as a reference, and a classification calculation method that fits Tibetan grammar and the structural characteristics of Tibetan text is put forward. The method for the classification of Tibetan words and the classification calculation method are then analyzed experimentally.
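
The stages listed in the abstract (preprocessing, word-level processing, feature selection, weight calculation, classification) can be illustrated with a minimal Python sketch. It is not the thesis's method, which the abstract does not specify. Every concrete choice here is an assumption made for illustration: splitting text into syllables on the Tibetan tsheg (U+0F0B) and shad (U+0F0D) stands in for the word-level step, document frequency stands in for feature selection, TF-IDF for weight calculation, and cosine similarity to class centroids for the classification calculation. All function names are hypothetical, and the toy documents use placeholder tokens rather than real Tibetan text.

```python
import math
import re
from collections import Counter, defaultdict

TSHEG = "\u0f0b"  # Tibetan syllable delimiter
SHAD = "\u0f0d"   # Tibetan clause/sentence delimiter


def tokenize(text):
    """Split on tsheg, shad, and whitespace (assumption: syllables stand in for words)."""
    return [t for t in re.split(f"[{TSHEG}{SHAD}\\s]+", text) if t]


def select_features(token_lists, min_df=1, max_features=1000):
    """Keep the most document-frequent tokens (assumed stand-in for feature selection)."""
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    kept = sorted(((t, n) for t, n in df.items() if n >= min_df),
                  key=lambda item: (-item[1], item[0]))
    return dict(kept[:max_features])


def tfidf(tokens, df, n_docs):
    """L2-normalised TF-IDF weights over the selected features."""
    tf = Counter(t for t in tokens if t in df)
    vec = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def train(docs, labels):
    """Build one centroid per class by summing its documents' TF-IDF vectors."""
    token_lists = [tokenize(d) for d in docs]
    df = select_features(token_lists)
    centroids = defaultdict(Counter)
    for tokens, label in zip(token_lists, labels):
        for t, w in tfidf(tokens, df, len(docs)).items():
            centroids[label][t] += w
    return df, len(docs), dict(centroids)


def classify(text, model):
    """Return the label whose centroid has the highest cosine similarity to the text."""
    df, n_docs, centroids = model
    vec = tfidf(tokenize(text), df, n_docs)

    def score(centroid):
        norm = math.sqrt(sum(w * w for w in centroid.values()))
        return sum(w * centroid.get(t, 0.0) for t, w in vec.items()) / norm if norm else 0.0

    return max(sorted(centroids), key=lambda label: score(centroids[label]))


# Toy data: placeholder tokens joined by tsheg and closed by shad (not real Tibetan).
docs = [f"alpha{TSHEG}beta{TSHEG}alpha{SHAD}", f"gamma{TSHEG}delta{SHAD}"]
model = train(docs, ["A", "B"])
print(classify(f"alpha{TSHEG}gamma{TSHEG}alpha", model))
```

A real system would replace the syllable splitter with a proper Tibetan word segmenter and would likely use a stronger feature-selection criterion; the sketch only shows how the stages feed into one another.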
Keywords/Search Tags: the Tibetan web, the automatic classification, the classification of Tibetan words, the characteristic selection, the classification calculation method