
Semi-supervised Clustering And Its Application On Plant Leaf Image Recognition

Posted on: 2018-11-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L L Li
Full Text: PDF
GTID: 1313330515450491
Subject: Agricultural Electrification and Automation
Abstract/Summary:
As a necessary part of modern agriculture, plant identification plays an important role in areas closely related to production and daily life, such as plant taxonomy, precision agriculture, horticulture and Chinese medicine research. The plant leaf is an organ with a flat, two-dimensional structure, and its shape, margin and texture features differ greatly between species. Leaf features are therefore commonly used as a key index for distinguishing plant species and their appearance, and an accurate, efficient algorithm for image feature extraction and classification is the key to solving the leaf classification problem. Feature selection and classification algorithms for plant leaf recognition have recently made some progress in both theory and application. However, research on algorithms for identifying leaf images with highly similar features has rarely been reported. Moreover, as image acquisition technology improves, the resolution and dimensionality of captured leaf images keep increasing, so feature extraction from high-dimensional leaf images and classifier design have become new problems for leaf classification.

This research takes image feature extraction and classifier design as its starting point. In detail, semi-supervised fuzzy clustering with feature discrimination (SFFD) is used as the classifier, and the research focuses on dimensionality reduction, the semi-supervised fuzzy clustering algorithm and its application to leaf classification. To solve the key problems in the image recognition process for leaf classification, a novel dimensionality reduction algorithm, an optimization method for clustering parameters and SFFD are proposed. Based on these approaches, a framework for leaf image classification was constructed, and a comprehensive evaluation was performed on UCI datasets and measured datasets. The main work of this dissertation is as follows:

(1) This dissertation proposes a variant of the PCA dimensionality reduction algorithm, called L-PCA, which effectively reduces the feature dimension and improves the recognition rate. L-PCA is an improved PCA method based on the ideas of PCA, a global linear dimensionality reduction algorithm from the classical convex algorithms, and of LDA. The algorithm retains the covariance structure of the original samples and selects the most important principal components from the transformation matrix for weighting. By adjusting the within-class and between-class scatter matrices, distances within the same class are minimized and distances between classes are maximized, in order to find a suitable mapping subspace that separates the data into different categories. Results on artificial and measured datasets indicate that the mean generalization error of L-PCA reaches 11.94% with a 1-NN (nearest neighbor) classifier, its average dimensionality reduction accuracy is 94.50%, and its score for continuity of the target data representation is 0.97.

(2) A variant of the traditional FCM algorithm was designed, and a weight exponent optimization algorithm for this variant, named EOSD and based on the fuzzy separation degree, was proposed. The fuzzy separation degree is constructed and calculated from the fuzzy partition index and the separation index. To optimize the weight exponent, an inflexion point in the fuzzy separation degree curve is then located by observing the variation of the separation curve on measured and artificial datasets. The experiments show that EOSD selects the best weight exponent efficiently and that the optimal value of m lies between 1.8 and 2.2, so 2 is taken as the optimal weight exponent for the FCM family of algorithms.

(3) To design a useful classifier for pattern classification, a semi-supervised fuzzy clustering algorithm with feature discrimination (SFFD) was proposed, which incorporates a fully adaptive distance function and takes pairwise constraints into account. To improve recognition capability, an effective feature enhancement procedure is applied to the entire dataset to obtain a single set of feature weights by weighting and discriminating the information provided by the user. The feature weights are used to modify the objective function of SFFD, and a comprehensive evaluation was then performed using accuracy and NMI. Experiments on eight UCI datasets demonstrate that SFFD effectively solves the common clustering problem and that its performance is about 7.74% higher than the mean of the other algorithms compared. Moreover, feature weighting in SFFD improves classification accuracy by about 2.00% to 7.00%.

(4) To determine the optimal number of clusters for the related clustering algorithms and to analyze the role of the weighting factor during clustering, different validity evaluation algorithms were used to assess the effectiveness of SFFD, and the evolution of the weights and the partition matrix was monitored during the process. First, based on SFFD, four fuzzy clustering validity indices (PC, CE, SC and XB) were introduced to evaluate the corresponding partitioning results, and the optimal number of clusters was obtained by comparing the indices on experimental data. Second, labeled data and prior knowledge were used to generate the pairwise constraints that guide the semi-supervised clustering process. Both a UCI dataset and a measured dataset, called the Leaf dataset, were then used for the cluster analysis. Finally, the variation curve of the weights v of the input feature vector was obtained in order to analyze the effect of feature weights on clustering performance and the partition result. The experiments show that a well-chosen clustering validity index handles the determination of the optimal number of clusters effectively, keeping the error within 2, and that the feature weights clearly separate the values of the partition matrix in fewer than 20 iterations. A method based on feature weights is therefore a simple and effective way to improve clustering performance.

(5) SFFD was used as the classifier for leaf classification, and a basic framework for leaf image recognition was outlined. Digital leaf images were captured by on-the-spot collection as the input dataset, and multiple identification feature matrices were extracted from the initial images. The feature weighting procedure of the SFFD classifier greatly increases clustering speed and effectively enhances the classification quality of the algorithm. Real leaf images belonging to ten plant species were used to evaluate its performance. The experiment demonstrates that the algorithm omits the sample training step and that, with 30% side information, the recognition accuracy for each feature ranges from 72.40% to 86.46%. The recognition accuracy with a single feature is 82.89%. In addition, under the same preprocessing algorithm and labeled data, the margin feature or the combined feature may be the best choice for leaf classification, followed by the shape feature.

In summary, to solve the practical problem of leaf image classification, this work combines dimensionality reduction, parameter optimization, a semi-supervised clustering algorithm, cluster evaluation and application analysis into several key approaches, and these methods work well in practice. The experiments show that whether a leaf classification algorithm works depends mainly on the classifier and the feature extraction algorithm. Appropriate side information can effectively improve recognition accuracy and performance, and a reasonable dimensionality reduction algorithm can greatly reduce the computational complexity of feature extraction.
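
To make the roles of the weight exponent m and of the validity indices more concrete, the following is a minimal Python sketch of the standard (unsupervised) FCM algorithm together with two of the indices named above, the partition coefficient (PC) and the classification entropy (CE). It is not the FCM variant, EOSD or SFFD from the dissertation, whose details are not given in the abstract; the function names, the default m = 2, the synthetic data and the range of cluster numbers tried are all assumptions made for illustration.

```python
import numpy as np


def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative, not the dissertation's variant).

    X: (n_samples, n_features) array, c: number of clusters,
    m: weight exponent (fuzzifier), must be > 1.
    Returns the membership matrix U (n_samples, c) and the cluster centers.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, each row normalized to sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


def partition_coefficient(U):
    """PC in [1/c, 1]; larger values indicate a crisper partition."""
    return float((U ** 2).sum() / U.shape[0])


def classification_entropy(U):
    """CE >= 0; smaller values indicate a crisper partition."""
    U_safe = np.clip(U, 1e-12, 1.0)
    return float(-(U_safe * np.log(U_safe)).sum() / U.shape[0])


# Hypothetical demo data: three Gaussian blobs in 2-D.
demo_rng = np.random.default_rng(42)
demo_X = np.vstack([
    demo_rng.normal(loc=center, scale=0.4, size=(50, 2))
    for center in ([0.0, 0.0], [5.0, 5.0], [0.0, 5.0])
])

# Compare candidate cluster numbers using PC and CE with m = 2.
for c in range(2, 6):
    U, _ = fcm(demo_X, c, m=2.0)
    print(c, round(partition_coefficient(U), 3),
          round(classification_entropy(U), 3))
```

In this kind of comparison, a candidate cluster number with a high PC and a low CE is usually preferred. PC and CE tend to change monotonically with the number of clusters, which is one reason several indices are normally examined together.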
Keywords/Search Tags:Semi-supervised clustering, leaf classification, dimensionality reduction, feature discrimination, clustering performance