Application Research Of Stacking Algorithm In Ordered Data And Text Classification

Posted on: 2022-11-15
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Wang
Full Text: PDF
GTID: 2518306782477554
Subject: Automation Technology
Abstract/Summary:
The stacking algorithm is a common method for improving classifier performance. Generally, multiple primary classifiers are trained, and a strong classifier is then trained on the outputs of the primary classifiers. This thesis explores the application of the stacking algorithm when the output variables are ordinal and when the input data are non-numerical variables such as text.

(1) The thesis first explores the application of the stacking algorithm to ordered data. Ordered data are discrete data with an order relation. The conventional stacking algorithm does not take the order relation between the ordered variables into account when classifying ordered data, which reduces classification accuracy. The thesis therefore proposes the ordered stacking algorithm, which accounts for the order relation of ordered data. First, ordered data with K categories are divided into K-1 binary data sets in category order, and K-1 classification functions are trained. Then the random forest algorithm is used to construct the mapping function from the class vector of a sample to the real class of the sample. The ordered stacking algorithm is applied to common classification algorithms and is found to significantly improve classifier performance.

(2) The thesis then explores the application of the stacking algorithm to text classification. Text classification is the process of predicting the category of a text according to rules. The stacking algorithm requires multiple primary classifiers. However, with the development of pre-trained models and deep learning, the primary classifiers used in text classification tasks have a huge number of parameters, so the stacking algorithm requires a lot of computational resources. The thesis therefore proposes the IFGSR algorithm, based on the idea of retraining in the stacking algorithm. The IFGSR algorithm divides the text into multiple sub-texts and uses the existing classification rules to calculate the probability that each sub-text belongs to each category, together with the standard deviation, mean, and maximum of these probabilities. The Softmax model is then used to train the mapping function from these three statistics to the real category, and the mapping function is used to classify the text to be predicted. Finally, the effectiveness of the IFGSR algorithm is verified by a paired-samples t-test.

Based on the idea of the stacking algorithm, the thesis proposes the ordered stacking algorithm and the IFGSR algorithm. The idea of stacking is applied to ordered data and text classification tasks, and the effectiveness of the ordered stacking algorithm and the IFGSR algorithm is demonstrated by experiments.
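The two data-preparation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several details are assumptions: labels are encoded as integers 0 to K-1 (K being the number of categories), binary target k asks whether the label is greater than k, the three statistics are computed per category, and the population standard deviation is used. The function names `ordinal_binary_targets` and `subtext_statistics` are hypothetical. The meta-learners (random forest, Softmax) are omitted.

```python
from statistics import mean, pstdev


def ordinal_binary_targets(labels, num_classes):
    """Split ordinal labels 0..K-1 into K-1 binary target lists.

    Target k marks whether each label is greater than k. This threshold
    convention is an assumption; the abstract only says the split
    follows the order of the categories.
    """
    return [[1 if y > k else 0 for y in labels]
            for k in range(num_classes - 1)]


def subtext_statistics(subtext_probs):
    """Return (std, mean, max) per category over all sub-texts.

    subtext_probs holds one probability vector per sub-text, all of the
    same length. Per-category aggregation and the population standard
    deviation are assumptions made for this sketch.
    """
    per_category = list(zip(*subtext_probs))  # transpose to category-major
    return [(pstdev(p), mean(p), max(p)) for p in per_category]


# Four samples with K = 4 ordered categories give K - 1 = 3 binary tasks.
labels = [0, 2, 1, 3]
targets = ordinal_binary_targets(labels, num_classes=4)

# Two sub-texts scored against two categories by an existing classifier.
probs = [[0.5, 0.5], [0.25, 0.75]]
stats = subtext_statistics(probs)
```

In a full pipeline, one binary classifier would be trained per list in `targets`, and their outputs for a sample would form the class vector passed to the second-stage model. Likewise, the tuples in `stats` would be flattened into the feature vector for the second-stage text classifier.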
Keywords/Search Tags: Deep learning, Machine learning, Stacking algorithm, Ordered data, Text classification