| With the explosive growth in the number of high-resolution remote sensing images, the demand for automatic remote sensing (RS) image interpretation has increased dramatically. Semantic segmentation of RS images is usually a mapping step that generates semantic maps containing various regions of interest from RS image data. For example, each pixel in a land cover map is assigned a category label according to the type of land cover (vegetation, road, ...) or object (car, building, etc.) observed at that pixel. At present, most advanced algorithms for semantic segmentation are based on the deep convolutional neural network (DCNN), which offers excellent performance and high accuracy. However, problems remain, such as the loss of high-frequency details and the blurring of object boundaries. On the one hand, combining a DCNN-based semantic segmentation model with structural information can reduce the impact of detail loss, so that the details of the segmentation results are more accurate. Since the labels of spatially neighboring pixels tend to be highly dependent, this structural information can be used to improve labeling accuracy significantly. If the segmentation task is treated as an independent pixel labeling problem, not only does the complexity of pixel-level classification increase, but false labels and spatially incoherent results also follow. For example, occlusions and shadows caused by high-rise buildings lead to low pixel-level classification accuracy when no information from surrounding pixels is available. On the other hand, the generative adversarial network (GAN) has become a powerful framework for a variety of tasks. Considering that pitting two networks against each other can effectively improve network performance, the GAN architecture is applied to RS image semantic segmentation in this paper. The generator network produces the predicted segmentation. The discriminator network flexibly detects mismatches in various high-order statistics
between predicted data and real data to generate an adversarial loss, which guides the training of the generator network toward a more accurate segmentation. Therefore, by applying the GAN structure to the semantic segmentation problem, adversarial training can be used to evaluate the joint configuration of many label variables, and the traditional loss function is replaced by one that combines the traditional multi-class cross-entropy loss with the adversarial loss. Under the constraint of the adversarial loss, the segmentation model is encouraged to generate predictions that are good enough to confuse the discriminator. In this paper, an end-to-end RS semantic segmentation model is proposed, which combines the generative adversarial network with a conditional random field (CRF) into an integrated structure. A convolutional encoder-decoder architecture and the CRF together serve as the generator, producing predicted segmentations by converting the CRF mean-field inference into convolutional layers. Our model optimizes both the traditional multi-class cross-entropy loss and the adversarial loss term generated by the discriminator network. The discriminator determines whether its input comes from the predicted segmentation or from the ground truth (GT), which pushes the output of the generator as close as possible to the distribution of the GT and thereby improves segmentation performance. Since the discriminator is usually a convolutional neural network that takes the joint configuration of many variables into account, it makes it possible to exploit high-order potentials within the network. Through training, the network learns the geometric differences between the predicted segmentation map and the GT on its own, rather than relying on specific hand-crafted high-order potentials. To evaluate the effectiveness of the proposed model, we conduct experiments on three datasets: the Oberpfaffenhofen region near the German Aerospace Center (DLR), the ISPRS Vaihingen dataset, and GID proposed by Xia Guisong et al. The proposed model is
compared with FCN, DeepLab, SegNet and PSPNet in terms of segmentation results, F1 scores and accuracy, and the results verify the effectiveness of the proposed Conditional Random Field Adversarial Segmentation (CRFAS) model in the field of RS image segmentation. |
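
The combined objective described above can be illustrated with a minimal sketch. It is not the paper's implementation: the weighting factor `adv_weight`, the function names, and the convention that the discriminator outputs a score near 1 for ground truth and near 0 for a predicted segmentation are all assumptions made for illustration, and the abstract does not state the exact form of the adversarial term.

```python
import math

EPS = 1e-12  # numerical floor to avoid log(0)

def cross_entropy(probs, label):
    """Multi-class cross-entropy for one pixel: -log p(label)."""
    return -math.log(max(probs[label], EPS))

def binary_cross_entropy(score, target):
    """Binary cross-entropy between a discriminator score in (0, 1) and a 0/1 target."""
    score = min(max(score, EPS), 1.0 - EPS)
    return -(target * math.log(score) + (1 - target) * math.log(1.0 - score))

def generator_loss(pixel_probs, pixel_labels, disc_score_on_pred, adv_weight=0.1):
    """Mean per-pixel cross-entropy plus a weighted adversarial term.

    The adversarial term is small when the discriminator scores the
    prediction close to 1, i.e. mistakes it for ground truth (assumed convention).
    adv_weight is a hypothetical hyperparameter.
    """
    mce = sum(cross_entropy(p, y) for p, y in zip(pixel_probs, pixel_labels))
    mce /= len(pixel_labels)
    adv = binary_cross_entropy(disc_score_on_pred, 1)
    return mce + adv_weight * adv

def discriminator_loss(disc_score_on_gt, disc_score_on_pred):
    """The discriminator is trained to output 1 for ground truth and 0 for predictions."""
    return (binary_cross_entropy(disc_score_on_gt, 1)
            + binary_cross_entropy(disc_score_on_pred, 0))

# Two pixels, three classes: softmax outputs of the generator and their labels.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
labels = [0, 1]
print(generator_loss(probs, labels, disc_score_on_pred=0.3))
print(discriminator_loss(disc_score_on_gt=0.9, disc_score_on_pred=0.3))
```

In this sketch the generator is penalized both for per-pixel labeling errors and for producing label maps the discriminator can tell apart from ground truth, which is how the adversarial term can act on the joint configuration of labels rather than on each pixel independently.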