| With the rapid development of remote sensing technology,getting remote sensing data becomes much easier.And the field of remote sensing has also attracted the attention of much scholars.Remote sensing image scene classification is an important part of remote sensing image analysis and interpretation,and it is also an important technology to realize the detection of the earth.In recent years,a large number of researchers domestic and foreign have focused on the task of remote sensing image scene classification and proposed a large number of methods to extract semantic information of remote sensing scene images as much as possible.However,the existing scene classification methods still have the following shortcomings:(1)Most of the existing classification methods are still based on the low-level local features extracted by traditional methods,while the low-level local features have limited ability to describe images.(2)In the existing classification methods based on deep features,most of them directly use the extracted features as the final image representation without normalizing,there may be some abnormal features affect the results.Moreover,the deep features can better describe the image content,but ignore the spatial distribution information of the image,and therefore also limit the final classification accuracy;(3)The classifiers used in the existing methods usually only consider a single classifier.However,the performance of a single classifier is limited,that is,each single classifier can only have a good classification effect on certain scene scenes,and cannot accurately classify all scene categories,and thus affect the classification effect;Aiming at the above problems,this paper proposes a remote sensing image scene classification method based on pre-trained neural network and proves the effectiveness of the method on three public datasets.First of all,in order to solve the problem that the existing descriptors cannot descript remote sensing images very well.This paper proposes a method that use the DenseNet structure as the feature extractor.The design idea of DenseNet draws on the Inception structure of GoogLeNet and the structure of ResNet,but in essence DenseNet is a brand new network structure which can effectively capture the scene information of remote sensing images.In addition,we further fine-tuned the pre-trained DenseNet based on the ImageNet dataset and strengthened the network’s robustness to geometric transformation through data enhancement to obtain a feature extractor that is more suitable for remote sensing images.Second,there is a lack of further processing for deep features.In this paper,spatial pyramid is used to further process the deep features of the image.By dividing the image features into several sub-regions,the spatial information of the image is captured by extracting the local features in each sub-region.In addition,in order to further improve the description ability of the feature,this paper uses the FV feature encoding method to aggregate the local features to obtain the final representation of the image.In addition,this paper constructs a simple weighted voting ensemble classifier to improve the classification result.Each base classifier of ensemble classifier is trained by AdaBoost.M1 algorithm,and according to the performance of each base classifier give them a according weight.Then,each base classifier is considered by weighted voting to enhance classifier performance.Finally,this paper conducted validation experiments and a variety of comparative experiments on three public standard datasets.The experimental results show that the classification accuracy of the proposed method is improved on the three datasets,especially on the NSFU-RESISC45 dataset achieves the classification accuracy of 94.16%,which is the best classification result of the dataset so far.And the classification accuracy proves the effectiveness of the proposed method.In summary,the work contents and the proposed method in this paper have important theoretical and practical value for solving the problems of image representation and classifier construction in remote sensing image scene classification. |