Bayes-based Classification Algorithm For Web Text

Posted on:2005-01-22

Degree:Master

Type:Thesis

Country:China

Candidate:H Zhang

Full Text:PDF

GTID:2168360152969232

Subject:Computer application technology

Abstract/Summary:

With the flood of information on the Web, Web mining has become a research topic that draws great interest from many communities. Many Web mining algorithms exist, and simple (naive) Bayes is one of the effective ones. However, it needs many training documents to build a classifier, so improving accuracy while reducing the number of training documents is an important problem for simple Bayes-based classifiers.

Text mining comprises word processing and text classification. English text is used as the classification sample to reduce the complexity of word stemming, since Chinese text requires much more work on word processing. An improved Document Frequency method is used as the criterion for building the feature vector. The classifier combines several methods, including probabilistic and iterative ones. Before classification, it supplies some words that are treated as latent labels. In each classification pass, the document is then assigned the label with the maximum posterior probability. The final label is composed of all the labels obtained from the iterations.

The resulting classifier is a simple Bayes-based classifier for Web text that combines the simple Bayesian model with latent word analysis. It reduces the complexity of the Bayesian network, improves classification accuracy, and reduces the number of training documents required. The experiments indicate that it is a good text classifier. It should be tested on Chinese text in the next stage of study.
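To make the two standard building blocks named in the abstract concrete, the following is a minimal sketch of plain Document Frequency feature selection followed by a simple (naive) Bayes classifier that picks the label with the maximum posterior probability. It is not the thesis's method: the improved Document Frequency variant, the latent labels, and the iteration scheme are not specified in the abstract and are not reproduced here. The function names, the `min_df` threshold, the Laplace smoothing, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def select_features(docs, min_df=2):
    """Keep words that occur in at least min_df documents (plain Document Frequency)."""
    df = Counter()
    for words in docs:
        df.update(set(words))  # count each word once per document
    return {w for w, n in df.items() if n >= min_df}

def train(docs, labels, vocab):
    """Count documents per class and in-vocabulary word occurrences per class."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for words, label in zip(docs, labels):
        word_counts[label].update(w for w in words if w in vocab)
    return class_docs, word_counts

def classify(words, class_docs, word_counts, vocab):
    """Return the class with the maximum log posterior, using Laplace smoothing."""
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for w in words:
            if w in vocab:
                # Smoothed log likelihood of the word given the class
                score += math.log(
                    (word_counts[label][w] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy corpus, already tokenized; not data from the thesis.
docs = [
    ["stock", "market", "price"],
    ["market", "price", "trade"],
    ["match", "team", "goal"],
    ["team", "goal", "score"],
]
labels = ["finance", "finance", "sport", "sport"]

vocab = select_features(docs, min_df=2)
class_docs, word_counts = train(docs, labels, vocab)
print(classify(["market", "goal", "price"], class_docs, word_counts, vocab))  # prints "finance"
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the smoothing term keeps a word unseen in one class from forcing that class's posterior to zero.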

Keywords/Search Tags:

data mining, text classification, simple Bayesian, iteration

Related items

1	The Application Of Text Categorization In Short Message Filtering
2	Background Learning Based Iterative Framework For Text Classification
3	Data Mining Research In Web Information Retrieval And Classification
4	Research On The Approach Of Classification In Data Mining Based On Naive Bayesian
5	The Research And Implementation Of Bayesian Classification Algorithm In The Text Based On Spark Platform
6	Research On Selective Bayesian Classifiers
7	Semi-simple Bayesian Classifier Research
8	Design And Implementation Of Data Mining Classification System Based On Hadoop
9	Design And Implementation Of A Text Classification System Based On KNN Algorithm
10	Naïve Bayesian-based Automatic Webpage Classification Technology Research