| Remote sensing information extraction technology plays an important role in all aspects of national life. However, in most applications, the datasets are imbalanced. In particular, with the continuous advancement of urbanization and the intervention of human activities, urban remote sensing datasets have shown an increasingly severe imbalance. Most information extraction techniques are affected by imbalanced datasets: they often become biased against the minority classes and may even fail to predict them, which limits the accurate acquisition of information. It is therefore urgent to improve the recognition accuracy of the minority class(es) while maintaining the recognition performance of the other classes when classifying imbalanced urban remote sensing areas. In the context of big data, it is especially important to use limited data to strengthen the understanding of the original imbalanced remote sensing data, which is of great significance for rapid response and reasonable decision-making. This thesis addresses the imbalanced classification of remote sensing data in urban areas. Based on an in-depth analysis and summary of the current state of imbalanced classification research, it targets the lack of research on single-model performance on imbalanced datasets, learning strategies, data space transformation, and evaluation. In terms of algorithms and learning strategies, eXtreme Gradient Boosting (XGB) and semi-supervised learning are introduced, and an impartial semi-supervised learning strategy with XGB (ISS-XGB) is proposed for imbalanced remote sensing classification on the premise of no extra data labeling cost. Moreover, the influence of the internal characteristics of the ISS strategy on imbalanced remote sensing classification is investigated in depth. Additionally, a three-level evaluation system covering accuracy, disagreement performance, and decision margin is established. The main 
work and research conclusions of this thesis are as follows: (1) At the algorithm level, the XGB model is introduced and compared with typical remote sensing classification methods to analyze the classification response to different levels of data complexity and sample imbalance. Experiments on 8 experimental areas show that: 1) XGB performs better than typical classification methods such as random forest, multi-layer perceptron, and support vector machine in the imbalanced classification of very-high-resolution (VHR) images, and the confidence of the correct classifications output by XGB is higher. However, when the sample is extremely imbalanced, the model performance is still affected by the imbalanced data distribution; 2) When sample imbalance reaches a certain level, the performance uncertainty of XGB is not sensitive to the spectral separability of the land-cover types; 3) The three-level evaluation method, which considers accuracy, disagreement performance, and decision margin, can analyze the model's response to imbalanced datasets from multiple perspectives. Thus, XGB is an effective method for imbalanced land-cover classification from VHR images, but it is still affected by the distribution of the sample data, especially when the sample is extremely imbalanced. (2) In terms of learning strategy, this thesis proposes an impartial semi-supervised learning method (ISS-XGB). ISS-XGB establishes an impartial and equal training scheme by eliminating the skewed distribution of the training data without increasing the labeling cost or the sample requirement. ISS-XGB is compared with typical semi-supervised learning methods (PU-BP and PU-SVM) and with the typical data-level method (SMOTE) commonly used for imbalanced learning in the remote sensing field. The results show that: 1) ISS-XGB can effectively identify the minority class without losing system performance; 2) The optimization of learning strategies and simulators in ISS-XGB helps to learn the 
decision boundary of the minority class more equally and comprehensively. Compared with PU-BP and PU-SVM, the overall accuracy and the minority-class F1 of ISS-XGB are about 20% and 15% higher. Although SMOTE-based classification methods can achieve accuracy close to that of ISS-XGB, they are not as stable as the latter; 3) The prediction confidence of ISS-XGB is not as good as that of the positive-positive learning scheme. (3) From the perspective of ensemble learning (the accuracy and diversity of the baselines), this thesis uses model perturbation (ISS-Hybird C) and parameter perturbation (ISS-Hybird P) to investigate the influence of different characteristics of ISS ensemble prediction systems for remote sensing under different weighted ensemble mechanisms. The results show that: 1) In ISS ensemble learning, the accuracy of the baselines has a more significant effect on generalization ability than their diversity; 2) A homogeneous ensemble of ISS-XGB with parameter perturbation can greatly improve the MPM performance of the model (53.85% higher than that of ISS-XGB) while maintaining accuracy and disagreement performance; 3) Different weighted ensemble mechanisms (MSE, Fpb, MPM) can improve the volatility of the accuracy and disagreement performance of ISS-Hybird C and ISS-Hybird P. Thus, ensuring the accuracy of the baselines while increasing the diversity of the outputs in an ISS ensemble system with weighted ensemble mechanisms can improve the output of the ISS semi-supervised ensemble prediction system from different performance perspectives. (4) From the perspective of the problem domain, this thesis uses Object-Based Image Analysis (OBIA) to transform the problem domain of imbalanced remote sensing classification, and studies the performance of imbalanced remote sensing classification on datasets with changing overall distributions. The results show that: 1) OBIA can improve the absolute accuracy of the models (the maximum overall accuracy of 
ISS-XGB can reach 98.13%, and the maximum F1 of the minority class can reach about 0.95), but the influence of the skewed data distribution persists and is similar to that observed with pixel-based methods; 2) The division into objects helps to improve the MPM output of the models, with the ISS-XGB method nearly 128.98% higher than its pixel-based performance. Thus, although OBIA can improve the homogeneity of class features and reduce the difficulty of classification and recognition, remote sensing classification and applications based on OBIA are still challenged by imbalanced datasets. |
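
The abstract reports both overall accuracy and the F1 of the minority class. A minimal sketch of why the two are reported together is given here. It is an illustration only and not the evaluation code of the thesis: the class names, the sample counts, and the two sets of predictions are hypothetical, and the disagreement and decision-margin measures are not reproduced because the abstract does not define them.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the reference label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 score of one class, computed from its TP, FP and FN counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no correct detections: precision and recall are both zero
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical toy scene: 95 majority samples and 5 minority samples.
y_true = ["building"] * 95 + ["water"] * 5

# A biased classifier that never predicts the minority class.
biased_pred = ["building"] * 100

# A classifier that finds 4 of the 5 minority samples
# at the cost of 3 false alarms on the majority class.
better_pred = ["building"] * 92 + ["water"] * 3 + ["water"] * 4 + ["building"]

for name, pred in [("biased", biased_pred), ("better", better_pred)]:
    oa = overall_accuracy(y_true, pred)
    f1 = f1_for_class(y_true, pred, "water")
    print(f"{name}: overall accuracy = {oa:.2f}, minority F1 = {f1:.2f}")
```

In this toy case the two classifiers differ by only one percentage point in overall accuracy, while the minority-class F1 separates them clearly. This is the kind of bias against minority classes that overall accuracy alone tends to hide.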