Dimension Reduction Algorithm Of Tobacco Raw Material Index Data By Fusing Global And Local Discriminations

Posted on: 2022-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
GTID: 2481306731953249
Subject: Electronic Science and Technology
Abstract/Summary:
The chemical composition of tobacco raw materials is an important indicator for tobacco quality appraisal. Because tobacco raw materials contain a wide variety of chemical constituents, the indicator data are high-dimensional and non-linearly structured, which makes tobacco quality evaluation difficult. To address this, this thesis proposes two algorithms: Kernel Global-local Marginal Discriminant Analysis (KGLMDA), which fuses global and local discriminations, and the Weighted Maximum Class Boundary Criterion (WMCBC). Both reduce the dimensionality of the indicator data while preserving the primary information in the original tobacco raw material indicators. The main research content is as follows.

1) By applying mainstream dimensionality reduction algorithms to tobacco raw material index data and analyzing the results, it is concluded that supervised and non-linear dimensionality reduction are better suited to the tobacco raw material chemical index data system. Because traditional dimensionality reduction algorithms cannot effectively reduce this data system or extract features from it, the KGLMDA algorithm is proposed. It uses a kernel transformation to project the data into a higher-dimensional space for analysis, which addresses the overlap between classes that makes the index data hard to separate. The algorithm introduces two regularization terms, R1 and R2, to describe the global and local structure of the tobacco chemical index data, so that feature extraction captures more effective global and local information and the classification results become more accurate and reliable.

2) Because existing algorithms cannot effectively separate data points belonging to different classes of tobacco chemical index data, the WMCBC algorithm is proposed. This supervised algorithm makes full use of the discriminative category information in the labels and optimizes the intra-class scatter matrix so that samples of the same class lie as close together as possible in the feature space. An added weight function optimizes the projection direction and resolves the overlap between different classes, enabling effective dimensionality reduction and grade classification of the tobacco raw material chemical index data system.

The KGLMDA and WMCBC algorithms are validated experimentally in combination with kNN classification. Their classification performance on dimensionality-reduced tobacco raw material index data is compared with that of PCA, KPCA, LDA, LFDA and MFA in three ways: the degree of separation between tobacco grades, assessed with measures of dimensionality reduction quality; the ability to extract global and local information under different training set proportions; and noise resistance under added random noise. Under five-fold cross-validation, the KGLMDA algorithm achieved accuracies of 93.33%, 90.96% and 95.21% for tobacco leaf grades B2F, C3F and X2F, respectively. The experimental results show that the proposed KGLMDA and WMCBC algorithms offer good non-linear dimensionality reduction, good global and local information extraction, and good noise resistance. They can be used to classify tobacco raw material index data into lower and higher grades, and they are more accurate and robust than the other algorithms compared.
Keywords/Search Tags:Tobacco chemical index, Data dimension reduction, Biregularization factor, Marginal Fisher analysis, Maximum Class Boundary Criterion
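
The evaluation protocol described in the abstract (reduce dimensionality, classify with kNN, score with five-fold cross-validation) could be sketched roughly as follows. This is an illustration only, not the thesis code: the names `knn_predict`, `five_fold_accuracy`, `make_toy_data` and the `fit_reducer` hook are hypothetical, the data are synthetic clusters rather than tobacco measurements, and no KGLMDA or WMCBC implementation is included.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def five_fold_accuracy(xs, ys, fit_reducer=None, k=3, seed=0):
    """Mean kNN accuracy over five shuffled folds.

    fit_reducer(train_x, train_y) is a hypothetical hook: it must return a
    function that maps one sample to its low-dimensional representation.
    None means the samples are classified without any reduction.
    """
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    folds = [order[i::5] for i in range(5)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        # Fit the reducer on the training fold only, so no test data leaks in.
        project = fit_reducer(train_x, train_y) if fit_reducer else (lambda v: v)
        reduced_train = [project(v) for v in train_x]
        hits = sum(
            knn_predict(reduced_train, train_y, project(xs[i]), k) == ys[i]
            for i in test_idx
        )
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


def make_toy_data(per_class=30, seed=0):
    """Three well-separated synthetic 2-D clusters (NOT real tobacco data)."""
    rng = random.Random(seed)
    centers = {"class_0": (0.0, 0.0), "class_1": (10.0, 10.0), "class_2": (20.0, 0.0)}
    xs, ys = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(per_class):
            xs.append((cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)))
            ys.append(label)
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_toy_data()
    print("mean five-fold accuracy:", five_fold_accuracy(xs, ys))
```

A dimensionality reduction method such as PCA or a kernel-based projection would be supplied through `fit_reducer`. Fitting it inside each fold, rather than once on the full data set, keeps the held-out samples from influencing the projection.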