
Research And Application Of Image Semantic Segmentation Algorithm Based On Deep Learning

Posted on: 2024-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: J D Du
Full Text: PDF
GTID: 2568307127454734
Subject: Computer technology
Abstract/Summary:
Image semantic segmentation is a fundamental task in computer vision. In essence it is a multi-class classification task: each pixel is an object to be classified, and contiguous pixels of the same category divide the image into several independent regions. In recent years, thanks to the efforts of many researchers, deep learning and computer vision have developed rapidly, and semantic segmentation has been applied in production in areas such as autonomous driving, remote sensing imagery, and medical diagnosis. Since the fully convolutional network (FCN) was proposed, it has become the basic method of semantic segmentation. Many new network models built on it have achieved better segmentation results, but some problems remain to be addressed. Starting from existing high-performance network models, this thesis analyzes their shortcomings and proposes improvement strategies that achieve better segmentation performance and generalization ability. The main work is summarized as follows:

1) To address the insufficient semantic association between the parameters of the Expectation Maximization Attention (EMA) algorithm and the image, as well as its lack of attention to inter-channel information, this thesis proposes a dual attention network, EMA+. The method comprises two modules: a spatial attention module and a channel attention module. In the spatial attention module, the EMA algorithm is used as the main structure, and the feature map itself is used as the initial parameter of the EM algorithm in the responsibility estimation (AE) step, which increases the semantic association between the parameters and the feature map. In the channel attention module, Efficient Channel Attention (ECA) is used: a one-dimensional convolution learns the interaction between channels, which avoids destroying the direct correspondence between channels and their weights that dimensionality reduction would cause. By fusing the spatial and channel attention modules, EMA+ significantly improves semantic segmentation performance. Experimental results show that the proposed EMA+Net achieves a higher mean intersection over union (mIoU) than EMANet and other competitive methods on the PASCAL VOC 2012 and Cityscapes datasets, and has comparatively good generalization ability.

2) To solve the problem that standard spatial average pooling and strip pooling cannot flexibly capture complete context information for long-range strip structures and discretely distributed target objects, this thesis proposes a diagonal strip pooling method called X-stripe pooling, which compensates for the long-range information neglected by existing pooling operations and builds a more complete context encoding. Based on X-stripe pooling, a plug-and-play X-stripe pooling module (XSPM) is designed. XSPM is added to the low-level and high-level features of the DeepLabv3+ network to form a new network called XSPNet. Comparative experiments on the commonly used PASCAL VOC 2012 and Cityscapes datasets show that XSPNet achieves excellent performance and better segmentation results. An ablation study of XSPM on PASCAL VOC 2012 shows that the module is also widely applicable.

3) To apply EMA+Net and XSPNet to practical problems, the proposed methods and several other advanced networks are first trained on the remote sensing dataset Potsdam. Then a remote sensing image semantic segmentation system is designed and implemented on the Flask framework, applying the proposed methods to the remote sensing image segmentation task. Users interact with the system in the browser: through a simple and intuitive interface they can call the proposed network models or several other advanced models, segment uploaded remote sensing images, and obtain the segmentation results and logs. The running results of the system show that the proposed methods have application value in practical problems.
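
The channel attention step in point 1) can be illustrated with a minimal sketch of the general ECA idea: pool each channel to a single value, run a one-dimensional convolution across the channel axis, and use the sigmoid of the result as a per-channel weight, so that no dimensionality reduction is involved. This is not the thesis implementation. The function names are hypothetical, the kernel is passed in as fixed numbers (in a real network it would be learned), and plain Python lists stand in for tensors.

```python
import math


def eca_channel_weights(feature_map, kernel):
    """Return one attention weight per channel.

    feature_map: list of C channels, each an H x W list of lists.
    kernel: odd-length list of 1-D convolution weights (hypothetical
            fixed values here; learned in a real network).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")

    # Global average pooling: one scalar per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

    # Zero-pad so the 1-D convolution keeps one output per channel.
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad

    weights = []
    for c in range(len(pooled)):
        # Each channel interacts only with its neighbouring channels.
        s = sum(kernel[j] * padded[c + j] for j in range(len(kernel)))
        weights.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid
    return weights


def apply_eca(feature_map, kernel):
    """Scale every channel by its attention weight."""
    weights = eca_channel_weights(feature_map, kernel)
    return [
        [[value * w for value in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]
```

Calling `apply_eca(feature_map, [0.25, 0.5, 0.25])` on a C x H x W nested list returns a feature map of the same shape in which each channel has been rescaled by a weight between 0 and 1.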
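
The diagonal pooling idea in point 2) can be sketched in a similar way. The abstract does not give the exact definition of X-stripe pooling, so the following is only an assumed illustration: just as strip pooling averages along rows and columns, this sketch averages a single-channel feature map along both diagonal directions and broadcasts each average back to the positions on that diagonal. The function name `diagonal_strip_pool` is hypothetical, and how the thesis combines the two outputs inside XSPM is not shown.

```python
def diagonal_strip_pool(fmap):
    """Average an H x W feature map along its two diagonal directions.

    Returns (main, anti), both H x W:
      main[i][j] is the mean over the diagonal through (i, j), where i - j is constant;
      anti[i][j] is the mean over the anti-diagonal through (i, j), where i + j is constant.
    This is an assumed illustration, not the thesis's XSPM.
    """
    h, w = len(fmap), len(fmap[0])
    main_sum, main_cnt = {}, {}
    anti_sum, anti_cnt = {}, {}

    # Accumulate sums and counts per diagonal and per anti-diagonal.
    for i in range(h):
        for j in range(w):
            d, a = i - j, i + j
            main_sum[d] = main_sum.get(d, 0.0) + fmap[i][j]
            main_cnt[d] = main_cnt.get(d, 0) + 1
            anti_sum[a] = anti_sum.get(a, 0.0) + fmap[i][j]
            anti_cnt[a] = anti_cnt.get(a, 0) + 1

    # Broadcast each diagonal mean back to every position on that diagonal.
    main = [[main_sum[i - j] / main_cnt[i - j] for j in range(w)] for i in range(h)]
    anti = [[anti_sum[i + j] / anti_cnt[i + j] for j in range(w)] for i in range(h)]
    return main, anti
```

Each output position then carries context from every pixel on the same diagonal, which is the kind of long-range oblique dependency that row-wise and column-wise strips do not cover.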
Keywords/Search Tags: Deep Learning, Image Semantic Segmentation, Attention, Strip Pooling, Remote Sensing Image