Research And Application Of Text Classification Algorithm For Chinese Information

Posted on: 2017-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: H Hong
GTID: 2308330485489207
Subject: Computer technology
Abstract/Summary:
With the rapid development of information, science, and technology, the Internet has become an indispensable part of everyday work and life, and people depend on it more and more. A large amount of news, pictures, video, and other content is produced on the Internet every day, and the volume of this data keeps growing. How to find what one needs among such varied information resources, and how to classify that information, has become a popular research topic.

This thesis studies text classification algorithms for Chinese information and their application. It introduces and implements three classification algorithms: naive Bayes, K-nearest neighbor (KNN), and support vector machine (SVM). These algorithms have been studied and improved by many earlier researchers, but the existing variants do not suit every model. This thesis therefore analyzes the mathematical principles of each algorithm and adapts them to the application environment considered here, so that they can be applied well to the classification of informational text. In addition, association rules and attribute reduction are added to the naive Bayes classification algorithm to improve classification accuracy. Experiments show that, for the corresponding model, the algorithm proposed in this thesis improves the accuracy of Chinese text classification.

The application environment is a mobile app called "Round Orange University Entrance Exam". The app offers consulting related to the college entrance examination for parents and examinees, and it also lets users look up university entrance exam scores by school, city, and year. The app contains an information section with news grouped into several categories: exam application, employment, study abroad, and campus. The problem is how to assign each news item or piece of text to the corresponding category. Because there may be hundreds of thousands of items every day, classifying them manually would be an enormous amount of work and is almost impossible in practice. Replacing tedious manual work with automatic classification not only saves manpower and material resources but also improves efficiency.
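The categorization task described above can be illustrated with a minimal naive Bayes sketch. This is not the thesis's implementation: it is a plain multinomial naive Bayes with add-one smoothing, without the association-rule or attribute-reduction improvements. The class name, the tiny training set, and the English tokens are all hypothetical, and the sketch assumes each document has already been segmented into a list of word tokens (Chinese text would need a word segmenter first).

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Hypothetical illustration only; documents are lists of tokens.
    """

    def fit(self, documents, labels):
        self.class_doc_counts = Counter(labels)    # documents per class
        self.word_counts = defaultdict(Counter)    # token counts per class
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, tokens):
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            # log prior: share of training documents in this class
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][token]
                # smoothed log likelihood of the token given the class
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, one pre-tokenized document per category.
train_docs = [
    ["application", "deadline", "admission", "score"],
    ["job", "fair", "graduate", "salary"],
    ["visa", "overseas", "university", "scholarship"],
    ["dormitory", "club", "campus", "canteen"],
]
train_labels = ["exam application", "employment", "study abroad", "campus"]

classifier = NaiveBayesTextClassifier().fit(train_docs, train_labels)
print(classifier.predict(["graduate", "salary", "job"]))
```

With this toy data the final call prints "employment", because all three tokens appear only in the employment document. A realistic setup would train on many labeled news items per category and evaluate accuracy on held-out data.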
Keywords/Search Tags: Bayesian classification algorithm, K-nearest neighbor (KNN) classification algorithm, support vector machine (SVM) algorithm, association rules, attribute reduction