Advances in aerospace, sensor, and communication technology have driven the rapid development of remote sensing toward high spatial resolution, high spectral resolution, high temporal resolution, multi-polarization, multi-angle observation, and miniaturization and agility. Among these, very high-resolution (VHR) remote sensing imagery has a significant effect on a country's military and economic development. It is widely used in military reconnaissance, earth surveying and mapping, ocean surveillance, and meteorological observation. The development of remote sensing technology lets people acquire remote sensing image data faster, with shorter acquisition cycles and better timeliness, so the number of remote sensing images has begun to grow exponentially. This growth has prompted researchers to study how to interpret large numbers of remote sensing images quickly, moving 'from data to information'. Extracting information that is effective and instructive for interpreting these images has become an urgent research problem.

Deep learning continues to evolve and has driven a series of revolutionary breakthroughs in artificial intelligence. It can be understood as a stack of layers that use non-linear processing to learn multiple levels of feature representation. Because deep learning can extract high-level semantic feature representations from large amounts of raw data, it offers good solutions to many practical applications, which has made it one of the most popular machine learning approaches in recent years. How to use deep learning to extract effective information from massive remote sensing image collections intelligently and quickly is a current research hotspot.

This dissertation takes high-resolution remote sensing images as its research object. Building on deep learning, it summarizes and studies the problems that exist in two classification tasks of remote sensing image interpretation: scene classification and object
detection. The innovations include:

1. Because remote sensing scenes are complex, span many categories, and are easily confused with one another, a fast deep perception network (FDP-Net) is proposed. FDP-Net is a remote sensing image scene recognition framework based on a deep convolutional neural network (DCNN) and a broad learning system (BLS). It first extracts the shallow and deep scene features of a remote sensing image with a pre-trained DCNN model. It then feeds both kinds of features into a purpose-designed deep perception model (DPModel) to obtain a new set of feature vectors that describe both the higher-level semantic information and the lower-level spatial information of the image. The DPModel is the key module, responsible for dimension reduction and feature fusion. Finally, the new feature vector is input into the BLS for training and classification, which yields a satisfactory classification result. FDP-Net effectively addresses the high inter-class similarity and large intra-class differences of remote sensing images, and it achieves better generalization performance and algorithmic complexity than traditional remote sensing image scene classification algorithms.

2. In remote sensing image object detection, small targets occupy few pixels and are easily disturbed or occluded by other objects, and labeling them is difficult. To address this, a model that combines a Sig-NMS-based Faster R-CNN with transfer learning is proposed. Sig-NMS replaces traditional non-maximum suppression (NMS) in the region proposal network stage and reduces the chance of missing small targets. Transfer learning is used to label remote sensing images effectively by automatically annotating both object classes and object locations. We carry out experiments on three VHR remote sensing image data sets to
validate our model, and the results demonstrate that the proposed approach can effectively detect small targets in VHR remote sensing images.

3. To find objects quickly and accurately in large-scale remote sensing images with complex backgrounds and rich information, a multi-scale spatial attention region proposal network for remote sensing images (MSA-RPN) is proposed. The network is an end-to-end system. It uses a residual network as the backbone, extracts feature representations at multiple scales from that backbone, and introduces a scale-specific feature gate module (SSFG) to extract smaller-sized features from remote sensing images. Building on the attention mechanism, a spatial attention-guided model (SAGM) is proposed to obtain an attention feature map for each scale of the multi-scale feature map. A series of experiments on a challenging remote sensing image dataset demonstrates that MSA-RPN significantly improves the recall and speed of region proposal for remote sensing images, especially for small targets.

4. Remote sensing images have complex textures and obscured backgrounds, which make it difficult to locate and recognize objects quickly, especially small objects. A network based on BLS and multi-scale fusion analysis is proposed, named DMAB-Net (Deep Multi-scale Attention and Broad learning system Network). First, the network obtains region proposals for remote sensing images using MSA-RPN. Then, it applies a channel attention mechanism to obtain and fuse feature information at multiple scales. Finally, a BLS based on Bayesian optimization recognizes the targets and adaptively learns its hyperparameters, so that it can adapt to different remote sensing image datasets. DMAB-Net is applied to three challenging remote sensing image data sets; compared with state-of-the-art remote sensing image target detection algorithms, its experimental results improve by about
2% to 5%, and DMAB-Net is more robust.
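
For readers unfamiliar with the baseline that Sig-NMS replaces in contribution 2, the following is a minimal sketch of traditional greedy non-maximum suppression. It is not the dissertation's Sig-NMS, whose formulation is not given in this abstract. The function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are illustrative assumptions.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2), with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Traditional greedy NMS: return kept box indices, highest score first."""
    # Visit candidate boxes from the highest score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Hard rule: discard a box if it overlaps any already-kept box
        # by more than the threshold; otherwise keep it.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 24, 24)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

In this toy input the second box overlaps the first with an IoU of 81/119 and is discarded outright, while the distant small box is kept. This hard cut-off is the step that contribution 2 swaps out in the region proposal stage.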