| Image semantic segmentation is a fundamental computer vision task:given an image,a semantic segmentation model outputs the corresponding semantic label for each pixel in the image.Semantic segmentation is widely used in practice,such as detecting road signs,detecting tumors,etc.Most,if not all,of the state-of-the-art segmentation models are base on convolu-tional neural network(CNN),which is designed for image classification.In this the-sis,we found that there are two problems in previous semantic segmentation models.Firstly,the resolution of the convolutional feature maps in CNN is very important while considering both performance and efficiency.Secondly,there are some deficiencies in the context models of previous segmentation models.We proposed dense CNN and Vortex Pooling to tackle these problems,which achieve state-of-the-art performance on the PASCAL VOC 2012 benchmark.We also found that semantic segmentation models can be used to improve the accuracy of the Fine-Grained Visual Classification(FGVC)task.We have two obser-vations about this task.Firstly,the pose of the objects in different images varies a lot.Secondly,the background in each image is complicated.We proposed Mask-CNN to tackle these two problems,which achieves state-of-the-art performance on the CUB-200 2011 dataset and the Birdsnap dataset. |