Semi-supervised Learning And Its Timeliness Assessment Applied To Classification In Remote Sensing

Posted on: 2022-11-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X R Zheng
GTID: 1482306773470964
Subject: Automation Technology
Abstract/Summary:
With the successful launch of many remote sensing satellites covering the world, surging demand has driven the rapid development and application of complex models for remote sensing classification tasks. These models place ever higher demands on the quality of large-scale labeled datasets. In terms of quantity, large amounts of labeled remote sensing data are hard to obtain: matching labels usually require many professional annotators to conduct large-scale field surveys and long-term industry mapping. In terms of quality, natural objects are unevenly distributed, so the sample size of each category is usually unequal during sampling, which leaves remote sensing datasets imbalanced, sometimes severely. Improving the quality of labeled datasets for remote sensing classification consumes manpower, material resources, money and time, and insufficient dataset quality often causes the following problems: (1) the lack of large-scale labeled data makes the model difficult to fit; (2) the lack of labeled data in the minority class, or imbalance between classes, leads to model bias.

In recent years, semi-supervised learning has received extensive attention. By exploiting the abundant natural features in large amounts of unlabeled data, it can effectively improve classification accuracy when dataset quality is insufficient. However, besides a small amount of labeled data, semi-supervised learning must also continuously process a large amount of unlabeled data, which incurs a large time cost. A high time cost often causes the classification task to fail, and the factors affecting running time are numerous and complex, so time cost is extremely important but difficult to evaluate. This problem is not specific to semi-supervised learning; it is common to other algorithms as well. This study therefore raises a further issue: (3) there is no effective framework for evaluating the time efficiency of a model. In summary, this study approaches the problem from the perspectives of accuracy and time efficiency, and conducts research on the following three aspects.

(1) A semi-supervised balancing method for imbalanced datasets is proposed. To address the lack of labeled data in the minority class and the imbalance between classes, this study proposes a new semi-supervised solution, Near Pseudo. In Near Pseudo, an initial classifier creates pseudo-labels for the unlabeled data, the best unlabeled samples are selected according to their distance from the minority samples, and these samples are added to the imbalanced dataset together with their pseudo-labels. In addition, a consistency-check feedback mechanism requires that the pseudo-label of every added sample agree with the minority class, which further improves pseudo-label quality. The experimental results show that Near Pseudo improves classification performance compared with other commonly used balancing methods. It can also be applied flexibly to most commonly used classifiers to improve their ability to handle imbalanced datasets, raising the classification accuracy of the minority class and reducing the model's bias against it.

(2) A semi-supervised fully convolutional neural network classification method is proposed. To address the difficulty of fitting a model without large-scale labeled data, this study proposes ER-Net, a fully convolutional neural network based on semi-supervised learning. ER-Net rebuilds a fully convolutional neural network with a consistency regularization term, so that unlabeled data enter model training through that term. A signal-to-noise-ratio data augmentation module is proposed to make the perturbation controllable. A shortcut connection structure is added to the model: when data are scarce in the early stage, the model can fall back on an identity mapping, and when the unlabeled data join training in the later stage, the model can learn deep features. An exponential function ties the weight of the consistency regularization term to the number of iterations, so that the influence of unlabeled data on the model increases gradually as training proceeds. The experimental results show that, compared with other fully convolutional neural networks, ER-Net improves classification accuracy on two public datasets using only 5% labeled data and 95% unlabeled data. It thus makes effective use of the rich information in unlabeled data and reduces the time cost of manual labeling.

(3) A time efficiency evaluation method is proposed. To address the lack of an effective framework for evaluating the timeliness of a model, this study reviews in detail the factors that may affect the running time of an algorithm (data size, number of categories, band characteristics, algorithm structure, algorithm parameters, etc.) and analyzes the relationships among them. A full-parameter time complexity framework, FPTC, is proposed to evaluate the timeliness of an algorithm. Under this framework, a clear qualitative relationship between each factor and the running time can be obtained by derivation from the algorithm. This study also proposes a coefficient that reflects the influence of the computing environment and physical factors on the running time. Through this coefficient, the quantitative relationship between FPTC and running time can be obtained. By combining FPTC with the coefficient, the running time of an algorithm can be estimated before the program runs, as can the change in running time caused by changes in the various factors. The experimental results show a strong linear relationship between the actual running time and FPTC, which verifies the qualitative relationship between FPTC and running time. The average root mean square error between the running time estimated by FPTC with the coefficient and the actual running time is 2.365 s, which illustrates the feasibility and accuracy of using them to predict running time. In natural disaster emergency response, FPTC can help to quickly screen for efficient algorithms with reasonable running times.
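The Near Pseudo selection step summarized in the abstract (pseudo-label the unlabeled data, keep only samples whose pseudo-label agrees with the minority class, then add those closest to the minority samples) might be sketched roughly as follows. This is an illustrative sketch only, not the dissertation's implementation: the nearest-centroid initial classifier, the Euclidean distance, the rule of adding samples until the minority class matches the largest class, and every function name are assumptions made for the example.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length feature tuples."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def predict(centroids, x):
    """Assumed initial classifier: return the label of the nearest class centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


def near_pseudo_balance(labeled, unlabeled, minority_label):
    """Hypothetical sketch of the selection step.

    labeled: list of (features, label) pairs; unlabeled: list of features.
    Returns the labeled set extended with pseudo-labeled minority samples.
    """
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    centroids = {y: centroid(xs) for y, xs in by_class.items()}
    minority = by_class[minority_label]

    # Assumption: add samples until the minority class matches the largest class.
    deficit = max(len(xs) for xs in by_class.values()) - len(minority)

    # Consistency check: keep only samples pseudo-labeled as the minority class.
    candidates = [x for x in unlabeled if predict(centroids, x) == minority_label]

    # Prefer candidates closest to any existing minority sample.
    candidates.sort(key=lambda x: min(math.dist(x, m) for m in minority))
    return labeled + [(x, minority_label) for x in candidates[:deficit]]


labeled = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((0.0, 0.1), "a"),
           ((0.1, 0.1), "a"), ((1.0, 1.0), "b")]
unlabeled = [(0.9, 1.0), (1.1, 0.9), (0.05, 0.05), (2.0, 2.0),
             (0.8, 0.8), (1.0, 1.05)]
balanced = near_pseudo_balance(labeled, unlabeled, "b")
```

In this toy example the minority class "b" grows from one sample to four, and the distant point (2.0, 2.0) is left out even though the initial classifier assigns it to "b", because closer candidates fill the deficit first. Any classifier that produces labels could stand in for the nearest-centroid rule.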
Keywords/Search Tags: Semi-supervised learning, remote sensing image, semantic segmentation, timeliness assessment, remote sensing classification