| With the rapid development of the Internet, massive amounts of text data are produced every day, and extracting useful information from these data has become a focus of current research. Text classification is an important part of data mining technology and facilitates the efficient storage and mining of massive text information, so research on it has real value and significance. First, building on a study of the general text classification model, a text classification model based on formal concept analysis is proposed for the case in which current text classifiers have only a small training text set. The model partitions the attribute features of the text, builds the formal context, constructs the concept lattice, and uses the classification rules extracted from each concept in the lattice as the rules for text classification. Second, for the extraction of classification rules from the concept lattice, this paper presents an improved extraction algorithm. The algorithm calculates the weight of each attribute in each classification rule and converts each extracted classification rule into a sum of attribute weights. It can extract more classification rules and so better avoids the situation in which too few classification rules are available to determine a class. In addition, at prediction time, evaluating a sum of attribute weights is simpler than matching the original classification rules, and it effectively reduces the time and space complexity of the decision. Finally, this paper uses the chi-square test as the feature selection method in text preprocessing and combines it with the proposed text classification model to develop text classification software based on formal concept analysis. Using this software as the experimental platform, several comparative experiments are conducted on open data sets, evaluated by three indicators: precision, recall, and F-measure. The experimental results show that, even with a relatively small training text set, the proposed model achieves good classification performance and significantly improves on the poor classification that traditional text classifiers exhibit because of over-fitting. |
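
The abstract names chi-square feature selection and the precision, recall, and F-measure indicators without giving their formulas. As a minimal sketch, the code below uses the standard textbook definitions, which may differ in detail from the paper's implementation. The function names `chi_square` and `precision_recall_f1` are hypothetical, and the F-measure is assumed to be the balanced F1 score.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic of one term against one class (2x2 table).

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that do not contain the term
    n00: documents outside the class that do not contain the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no evidence.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 (assumed F-measure) for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

A term that occurs only in documents of one class gets a high chi-square score and is kept as a feature, while a term spread evenly across classes scores zero and can be dropped before the formal context is built.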