| With the successive launch of several high-resolution remote sensing satellites in recent years, such as “GF-1”, “GF-2”, “GF-3”, “GF-4” and “SuperView-1”, remote sensing image data have grown explosively, marking our entry into the era of remote sensing big data. At present, these high-resolution satellite remote sensing data have been widely applied in many fields, for example land use, resource and environment investigation, ecological restoration, urban construction and homeland security. The question that follows is how, given the demands of storing, managing, retrieving and classifying remote sensing big data, to extract features from high-resolution remote sensing images and represent them more effectively; this has become a new challenge for the application of high-resolution remote sensing images. Object-oriented image analysis has become the main method of high-resolution remote sensing image processing, and in recent years feature extraction has been moving from traditional handcrafted features to data-driven feature learning. In particular, in 2012 Krizhevsky et al. proposed a deep Convolutional Neural Network (CNN), which was a milestone in the field of computer vision. Against this background of deep learning, we investigate feature representations for remote sensing image classification and retrieval: (1) Learning more discriminative image representations from deep features (e.g., features extracted from the fully connected layers of a pretrained CNN) is a key factor in achieving high remote sensing image classification accuracy. (2) Features from the convolutional layers of a CNN also contain abundant image information, and how to extract effective descriptors from convolutional features deserves further study. (3) The accuracy of CNNs on small-scale datasets tends to saturate; therefore research based on large-scale datasets, such as transfer learning and the representation ability of deep models, and low-dimensional embedding are important to
further improve the performance of classification or retrieval. (4) Previous work has proposed numerous hash-mapping methods for retrieval over massive data. However, most of these methods adopt hand-crafted features, and hashing has seldom been investigated in the remote sensing community, so hash-based remote sensing image retrieval needs to be studied. The main work and innovations are as follows: (1) Feature extraction using the fully connected layers of a deep CNN and discriminative convolutional kernel learning: A supervised convolutional kernel learning method called DCK (Discriminative Convolutional Kernel) is proposed to improve the separability of features extracted from the fully connected layers. First, the 4096-dimensional features extracted from the fully connected layers of a CNN are rearranged into a two-dimensional image (e.g., 64 × 64 pixels), from which a spatial arrangement of local patches is extracted using a sliding-window strategy. Then, a single discriminative convolutional kernel is learned for each extracted local patch under a supervised criterion that minimizes within-class variation and maximizes between-class variation. Finally, for each local patch, the learned DCK kernel is used to transform the features. Experiments on two remote sensing datasets demonstrate the effectiveness of the proposed DCK in improving the classification performance of deep features without increasing either the feature dimensionality or the training of the linear classifier. (2) Local descriptors based on deep learning and feature encoding: The extraction of local descriptors from features produced by the convolutional layers of a CNN is studied, and two aggregation strategies are proposed, one at the descriptor level and one at the mid-level feature level. First, two CNNs of different depths, CaffeNet and VGG-VD16, are introduced, and all fully connected layers in both networks are removed. Second, an image pyramid is constructed and used as the input to the CNN model to extract features from
convolutional layers at different scales. Then, taking the number of channels in the convolutional feature map as the feature dimension, a local descriptor is obtained at each spatial position by concatenating the convolutional features across channels. The Hellinger kernel and principal component analysis are applied to each descriptor for further transformation. Finally, the proposed aggregation strategies are used to obtain global image representations. Experiments on two remote sensing image datasets show that the deep local descriptors based on the image pyramid, combined with the two proposed aggregation strategies, can achieve higher accuracy than features extracted from the fully connected layers. (3) Cross-dataset transfer learning and dimensionality reduction for deep features: Data are critical for deep learning, and good data sometimes matter more than designing a new CNN model. Therefore, we use five recently published large-scale remote sensing image datasets to conduct CNN research and analyze cross-dataset feature representation. First, two different CNN models, CaffeNet and VGG-VD16, are introduced, and the pretrained models are fine-tuned on remote sensing image datasets. The representation ability of features obtained from the two fully connected layers of the fine-tuned models is investigated. Second, random projection (RP) is proposed to reduce the dimensionality of features from the fully connected layers. Experiments are conducted on remote sensing image retrieval and classification. Cross-dataset transfer helps in analyzing the generalization ability of existing datasets and provides a reference for other researchers in selecting domain-specific training data. The advantage of RP is that it involves no learning process, so it avoids learning dimensionality-reduction subspaces on large-scale remote sensing image datasets. (4) Deep hashing based on fully connected layers: Features obtained by feature encoding or extracted from deep CNNs
are usually high-dimensional. This significantly increases the computational complexity of distance measurement between images. In large-scale image retrieval in particular, it greatly reduces retrieval efficiency, and the disk space needed for storage is also relatively large. The design of a fully connected neural network that maps deep features to binary codes is studied, and a Fully Connected Hashing Neural Network (FCHNN) containing three fully connected layers is proposed for low-dimensional embedding of image features. FCHNN is trained with pairwise supervision and aims to map mid-level features (such as Fisher vectors) and deep features extracted from pretrained or fine-tuned CNN models into binary codes. Compared with end-to-end (pixel-to-binary) deep hashing networks, FCHNN has an advantage in learning efficiency. In terms of storage, a 4096-dimensional feature mapped to 64 bits requires only 8 bytes. Retrieval experiments on five remote sensing image datasets demonstrate that FCHNN obtains effective codes and desirable retrieval performance. |
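
The storage figure quoted for 64-bit codes can be checked with a short sketch. It is only an illustration of the arithmetic and of Hamming-distance comparison between packed codes, not the FCHNN itself: the codes below are random stand-ins rather than the output of a trained network, the names `pack_code` and `hamming_distance` are hypothetical, and the 4-byte `float32` storage of the original feature is an assumption.

```python
import numpy as np

FEATURE_DIM = 4096  # dimensionality of a fully connected layer feature
CODE_BITS = 64      # length of the binary code


def pack_code(bits):
    """Pack a 0/1 vector into bytes, 8 bits per byte."""
    return np.packbits(np.asarray(bits, dtype=np.uint8))


def hamming_distance(code_a, code_b):
    """Count the bits that differ between two packed codes."""
    return int(np.unpackbits(np.bitwise_xor(code_a, code_b)).sum())


rng = np.random.default_rng(0)
# Stand-in codes: random bits, NOT produced by a trained hashing network.
bits_a = rng.integers(0, 2, size=CODE_BITS)
bits_b = bits_a.copy()
bits_b[:5] ^= 1  # flip the first five bits

code_a = pack_code(bits_a)
code_b = pack_code(bits_b)

# Assumption: the original feature is stored as float32 (4 bytes per value).
float_bytes = FEATURE_DIM * np.dtype(np.float32).itemsize
code_bytes = code_a.nbytes

print(float_bytes, code_bytes, hamming_distance(code_a, code_b))
```

Under the `float32` assumption the original feature occupies 16384 bytes against 8 bytes for the code, and comparing two codes reduces to an XOR followed by a bit count.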