
Study Of Remote Sensing Image Classification Based On Machine Learning

Posted on: 2015-12-24  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Y Zhang
GTID: 1223330467957566  Subject: Forest management
Abstract/Summary:
Extracting useful information from remote sensing data is a long-standing problem in remote sensing science. Remote sensing image classification is one of the most basic problems in remote sensing image information processing and a key technology of remote sensing application systems. The performance of classification methods directly affects the application and development of remote sensing technology, so many researchers have been exploring new ways to improve the precision and speed of automatic remote sensing image classification algorithms.

In practical remote sensing image classification, a large amount of training data is needed to obtain high classification accuracy, and labeling that data takes considerable manpower and material resources, which is time consuming and laborious. On the other hand, modern high-resolution sensor technology makes it relatively easy and economical to collect plenty of unlabeled data from the imagery. How to take full advantage of ample unlabeled samples to improve the precision of remote sensing classification when few training examples are available is the key point of this study.

Based on the theory of machine learning, this paper studies the combination of ensemble learning, semi-supervised learning and active learning, and mines the most informative unlabeled samples in remote sensing image classification to enlarge the small labeled training dataset and thereby improve classification accuracy. The main work of the paper is as follows:

(1) The key to a multiple classifier system is the diversity of its base classifiers. A new measure, A&D, which takes into account both the accuracy of and the disagreement between two classifiers, is presented to measure the diversity among multiple classifiers. The multiple classifier system (MCS) is built from selectively ensembled classifiers in parallel. In a comparative analysis with Bagging and AdaBoost, the MCS method improved performance and ensured generalization of remote sensing image classification with small training sets. In addition, a new heuristic algorithm named Multivariant Optimization Algorithm (MOA) is explored to solve the SVM parameter optimization problem. Compared with conventional grid search, the genetic algorithm and the particle swarm algorithm, MOA is preferable in both efficiency and the quality of the parameters selected for the SVM.

(2) The combination of semi-supervised co-training and multiple classifiers is explored. The E_Self-training and Tri-MCS methods are presented, which combine the MCS with Self-training and Tri-Training during classifier training. With 20% training samples, the accuracy of E_Self-training is 0.849% higher than that of Self-training, and that of Tri-MCS is 2.395% higher than that of Tri-J48. A data editing method based on the kNN algorithm, named NE-NED, is also presented to correct and remove mistakenly labeled samples; this improves the quality of the training examples and raises overall classification accuracy, for example from 91.43% to 91.86%. Then SSLRF (Semi-supervised Learning Random Forest) is presented, which embeds the construction of a random forest in the co-training procedure of semi-supervised classification. On the remote sensing data of the research area, it obtained 94.36% overall accuracy, 0.5325% higher than that of the random forest.

(3) Based on the theory of active learning, the paper focuses on the active sampling strategy. Random Sampling, Simple Disagreement Sampling (SDS), Entropy Priority Sampling (EPS) and multi-class SVM sampling models are presented. With 20% training samples, SDS and EPS improve on the J48 decision tree by 1.34% and 1.33%, respectively. On the same dataset, where the single SVM classifier has an accuracy of 91.31%, the presented methods SVMEPS and SVMAL achieve accuracies of 92.93% and 92.46%. The experimental results show that the active learning sampling methods are superior to the traditional supervised classifier and enhance classification performance when few labeled training samples are available. The multi-class SVM model can address low classification accuracy for classes whose training examples are difficult to acquire.

(4) Because semi-supervised learning and active learning are complementary, the paper studies how to combine them effectively to enhance classifier accuracy. The SDS and EPS active learning sampling methods are merged into each iteration of the tri-training process. The resulting ensemble models Tri-SDS and Tri-EPS, with accuracies of 93.92% and 93.57%, outperform Tri-Training, SDS and EPS. The SemiAL model, a union of semi-supervised and active learning, is also proposed; it computes confidence from vote entropy and neighborhood similarity, making full use of the information in the unlabeled samples from different points of view to improve the generalization of learning. Based on the SVM classifier, the presented methods SemiALEPS and SemiALCR obtain accuracies of 91.97% and 92.55%, compared with 91.31% for the SVM classifier alone.

(5) To construct multiple views, three ways of building two views from a single-view remote sensing image with spectral and texture features are put forward: Random Split Feature, Spectral Texture Split and Variable Importance Split. Based on active learning and semi-supervised active learning, the multi-view classification models MV-SDS, MV-EPS and MV-SemiAL are proposed. Classification results on an actual remote sensing image are analyzed for the different classification models under the three ways of building views. With Spectral Texture Split, MV-SDS, MV-EPS and MV-SemiAL improve by 4.11%, 3.87% and 4.36%, respectively, on the single SVM classifier, whose accuracy is 89.75%. The experimental results show that multi-view active learning and semi-supervised learning can achieve better classification performance with few labeled samples.

(6) Following the features and basic theory of grid computing, this paper designs and builds a platform for parallel processing of remote sensing image classification, which supports dynamic extension of services and nodes. The remote sensing classification tasks are deployed on remote nodes; the grid server decomposes the task and schedules the sub-nodes to complete the assigned sub-tasks. The experimental results show that, for a remote sensing image of the research area 932M in size, classifying the sub-images in parallel on six nodes gives a speedup of 3.2 and a parallel efficiency of 0.53 (3.2/6). The platform enhances the efficiency of classifying large remote sensing images in a parallel architecture.

This paper puts forward several models and methods that integrate ensemble learning, semi-supervised learning and active learning. They take full advantage of ample unlabeled samples to expand the training set, in order to improve the performance and generalization of supervised classification. The work provides an effective way to improve the performance of remote sensing image classification when few labeled examples are available.
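The abstract names Simple Disagreement Sampling, Entropy Priority Sampling and vote entropy without defining them. The sketch below shows one common query-by-committee reading of those ideas, not the dissertation's own formulation: a sample is "disputed" if the committee members do not all agree, and samples are ranked for labeling by the entropy of the committee's votes. The function names, the class labels and the four example vote lists are all hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy (in bits) of the labels a committee voted for one sample."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def disagrees(votes):
    """True when the committee members do not all predict the same label."""
    return len(set(votes)) > 1


def rank_by_entropy(committee_votes, k):
    """Return the indices of the k samples with the highest vote entropy."""
    order = sorted(
        range(len(committee_votes)),
        key=lambda i: vote_entropy(committee_votes[i]),
        reverse=True,
    )
    return order[:k]


# Hypothetical votes from three classifiers on four unlabeled pixels.
committee_votes = [
    ["forest", "forest", "forest"],  # unanimous
    ["forest", "water", "forest"],   # two-to-one split
    ["forest", "water", "crop"],     # three-way split
    ["water", "water", "water"],     # unanimous
]

# Disagreement-style selection: any sample the committee argues about.
disputed = [i for i, v in enumerate(committee_votes) if disagrees(v)]

# Entropy-style selection: the most contested samples first.
top_two = rank_by_entropy(committee_votes, 2)

print(disputed)
print(top_two)
```

In an active learning loop, the selected indices would be sent to an annotator and the newly labeled samples added to the training set before the committee is retrained.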
Keywords/Search Tags: Ensemble learning, Semi-supervised learning, Active learning, Multi-view classification, Sampling strategy