With the rapid development of airborne sensor techniques, aerial images carrying high-resolution spatial information have attracted much attention. To analyze and utilize aerial image datasets, it is necessary to label the pixels of different categories in each image. Thus, multi-label aerial image classification has become a prerequisite for many practical applications. Various studies demonstrate that the co-occurrence relation between different categories of objects plays an important role in this task. This paper proposes two methods to learn the co-occurrence relation:

(1) A multi-label aerial image classification method based on a pixel-object level co-occurrence relation learning network is proposed. Due to the influence of the imaging angle, the features of pixels within object regions of the same category vary according to their positions. To learn the co-occurrence relation more accurately, this paper proposes to use a pixel-level and an object-level co-occurrence relation module simultaneously. The pixel-level module measures the co-occurrence relation through the feature similarity between pixels at different spatial positions. However, a single pixel cannot fully represent a whole object, so the pixel-level co-occurrence relation may not effectively help the target pixel determine its category. To alleviate this issue, this paper proposes an object-level co-occurrence relation module, which measures the relation between objects from a global view. Experiments show that the proposed method achieves good classification performance on both the UCM and DFC15 public evaluation datasets.

(2) A multi-label aerial image classification method based on a joint model of self-attention and graph convolution is proposed. The pixel-object level co-occurrence relation learning network has two problems: 1) it ignores the co-occurrence relation between pixels of different channels; 2) it depends heavily on the label prediction order. To solve these problems, this paper proposes to use a self-attention network to learn the co-occurrence relation along both the channel and spatial dimensions, and to use a graph convolutional network, which is insensitive to sequence order, to predict multiple labels. Specifically, on the one hand, a dual self-attention network is used to model the correlation between pixels at different spatial positions and between different feature maps, so as to further learn the implicit co-occurrence relation of intermediate features. On the other hand, to better learn the object-level co-occurrence relation, a graph convolutional network is used to exploit the semantic correlation between labels. Experiments on two multi-label aerial image datasets show that the proposed method achieves the best performance, which further verifies its effectiveness.
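
The idea of measuring co-occurrence through feature similarity, across spatial positions and across channels, can be illustrated with a minimal sketch. It is not the network described in this paper: it assumes plain dot-product similarity with no learned projections, fuses the two branches by element-wise addition, and all function names and the toy input are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    """Swap rows and columns of a matrix stored as lists of rows."""
    return [list(col) for col in zip(*a)]

def spatial_attention(feats):
    """feats is N pixels x C channels.

    Each pixel is re-expressed as a similarity-weighted mix of all pixels,
    so pixels with similar features reinforce each other.
    """
    weights = [softmax(r) for r in matmul(feats, transpose(feats))]  # N x N
    return matmul(weights, feats), weights

def channel_attention(feats):
    """Same idea across channels: similarity between feature maps."""
    maps = transpose(feats)                                          # C x N
    weights = [softmax(r) for r in matmul(maps, transpose(maps))]    # C x C
    return transpose(matmul(weights, maps)), weights

def dual_attention(feats):
    """Fuse both branches by element-wise sum (an assumed fusion choice)."""
    spatial_out, _ = spatial_attention(feats)
    channel_out, _ = channel_attention(feats)
    return [[s + c for s, c in zip(rs, rc)]
            for rs, rc in zip(spatial_out, channel_out)]

# Toy input (hypothetical): 3 pixels, 2 channels.
# Pixels 0 and 1 have similar features; pixel 2 differs.
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
fused = dual_attention(feats)
_, spatial_weights = spatial_attention(feats)
```

In this toy input, pixel 0 assigns more attention weight to the similar pixel 1 than to the dissimilar pixel 2, which is the behavior the pixel-level similarity measure relies on. A trained model would normally add learned query, key, and value projections before computing the similarities.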