| To make reasonable use of woodland soil resources and to match planted crops to woodland soil types, scientific and effective analysis of woodland soil elements is essential. This paper therefore analyzes Guangxi woodland soil using the Guangxi woodland soil element dataset provided by the Guangxi Academy of Forestry. It first preprocesses the dataset with an improved multi-label feature selection algorithm to filter out the features that are important for woodland soil analysis, then classifies the soil in four respects (fertility, category, texture and acidity) using an improved ML-KNN algorithm, and finally, on this basis, implements a multi-label woodland soil element analysis system. The main research contents of this paper are as follows:
(1) Existing multi-label feature selection algorithms consider neither that features with low correlation with the labels may still be important features, nor the weights of the labels. To address this, the paper proposes a subset-based multi-label feature selection algorithm (MFSWLS). MFSWLS first calculates the weights of the labels, then calculates the correlation between each feature and the weighted labels and arranges the features in descending order. The feature set is then divided into several subsets; within each subset, the redundancy between features and labels is calculated and arranged in descending order; finally, a certain proportion of features is taken from each subset and the selections are fused into a new feature set. Experiments show that MFSWLS outperforms five other multi-label feature selection algorithms in preprocessing the woodland soil element dataset.
(2) The K-Means algorithm selects its initial clustering centers randomly, which leads to unstable clustering results. To address this, the paper proposes a K-Means algorithm with improved initial clustering centers (ICCK-Means). ICCK-Means first calculates the correlation degree between samples and eliminates the samples with a small correlation degree, then selects the initial clustering centers and assigns samples to the corresponding clusters according to Euclidean distance. Finally, the clustering centers and clusters are updated repeatedly until no clustering center changes. Experiments show that ICCK-Means works better in combination with the improved ML-KNN algorithm.
(3) The ML-KNN algorithm does not consider that the nearest-neighbor samples lie at different distances from the sample to be tested and therefore have different similarities to it, and it takes a long time to process large datasets. To address both problems, the paper proposes an ML-KNN algorithm based on ICCK-Means clustering (IWML-KNN). IWML-KNN first uses ICCK-Means to cluster the training dataset and the test dataset, then computes the prior probabilities and conditional probabilities from the training samples and derives weights from the distances between the training samples and their nearest-neighbor samples. Finally, it calculates the posterior probability that the sample to be tested carries a given label according to the maximum a posteriori principle and Bayes' formula, and uses the new classifier function to predict the label set of the sample to be tested. Experiments show that IWML-KNN outperforms other improved ML-KNN algorithms.
Building on these three contributions, the paper classifies the test samples of the Guangxi woodland soil element dataset with an accuracy of 88.7% and, on this basis, completes the design of the multi-label woodland soil element analysis system. The system implements four modules: display of the original dataset, display of the dataset after feature selection, display of the soil classification results, and display of the soil element analysis predictions. Verification shows that the system can distinguish the specific types of woodland soil well. |
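
The subset-based selection idea in (1) can be illustrated with a minimal Python sketch. It is not the MFSWLS implementation: the abstract does not specify how label weights, correlation or redundancy are computed, nor which end of the redundancy ordering is retained. The sketch therefore assumes label weights equal to relative label frequency, relevance equal to the weighted absolute Pearson correlation, redundancy equal to the mean absolute correlation with the other features in the same subset, and retention of the least redundant features. The function names, `n_subsets` and `keep_ratio` are hypothetical.

```python
import math
import numpy as np


def abs_corr(a, b):
    """Absolute Pearson correlation; 0.0 if either vector is constant."""
    if a.std() == 0 or b.std() == 0:
        return 0.0
    return abs(float(np.corrcoef(a, b)[0, 1]))


def subset_feature_selection(X, Y, n_subsets=3, keep_ratio=0.5):
    """Pick a share of features from every relevance-ranked subset.

    X: (n_samples, n_features) feature matrix.
    Y: (n_samples, n_labels) binary label matrix.
    Returns a list of selected feature indices.
    """
    n_features, n_labels = X.shape[1], Y.shape[1]

    # Assumption: a label's weight is its relative frequency.
    freq = Y.mean(axis=0)
    weights = freq / freq.sum()

    # Assumption: relevance is the weighted |Pearson r| against each label.
    relevance = np.array([
        sum(weights[l] * abs_corr(X[:, j], Y[:, l]) for l in range(n_labels))
        for j in range(n_features)
    ])

    # Rank features by relevance (descending) and split into subsets.
    order = np.argsort(-relevance)
    selected = []
    for subset in np.array_split(order, n_subsets):
        if len(subset) == 0:
            continue
        # Assumption: redundancy is the mean |r| with the other features
        # of the same subset; a single-feature subset has redundancy 0.
        redundancy = [
            np.mean([abs_corr(X[:, j], X[:, k]) for k in subset if k != j])
            if len(subset) > 1 else 0.0
            for j in subset
        ]
        ranked = subset[np.argsort(redundancy)]  # least redundant first
        n_keep = max(1, math.ceil(keep_ratio * len(subset)))
        selected.extend(int(j) for j in ranked[:n_keep])
    return selected


# Synthetic data for illustration only (not the Guangxi dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 9))
Y = (rng.random(size=(60, 4)) < 0.4).astype(int)
selected = subset_feature_selection(X, Y)
print(selected)
```

Because a share of features is drawn from every subset, including the one with the lowest relevance, features that correlate only weakly with the labels still have a chance of entering the final feature set, which is the motivation stated for MFSWLS.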