Image Semantic Segmentation Method And Application Research Based On Deep Learning

Posted on: 2023-10-12
Degree: Master
Type: Thesis
Country: China
Candidate: P Y Zhang
GTID: 2558306623496604
Subject: Engineering
Abstract/Summary:
Semantic segmentation has been a popular research direction in computer vision in recent years. In essence, it classifies each pixel in an image so that a computer can recognize objects autonomously. It has a wide range of applications in robot autonomous perception, remote sensing image processing, medical treatment, autonomous driving, video surveillance, and other areas. In the era of big data, the analysis and processing of massive image data places new and urgent demands on the research and development of semantic segmentation methods. However, the object composition of image data is complex, and images contain objects with multiple scales and complex contours. Existing semantic segmentation methods have difficulty expressing the semantic features of multiple types of objects in a general way, and their predictions suffer from problems such as loss of small targets, incomplete object contours, and mismatched targets. In response to these problems, this thesis takes multi-scale feature fusion as its guiding idea, combines multi-scale feature extraction, attention mechanisms, feature fusion, and other methods to improve the accuracy of image classification, and carries out the following research:

(1) To address missing small targets and missing object contours in image semantic segmentation and classification, a double-path feature fusion network (DPF-Net) based on attention mechanisms and feature fusion is proposed. The main work is as follows. First, Res2Net is used as the backbone network for feature extraction, and the features of different granularity within the residual module are fused at multiple scales to improve the network's image recognition ability. Second, a position attention module is added to the low-level feature extraction to improve the correlation between channels and to segment object contours more accurately. Third, in the feature fusion stage, an adaptive pyramid pooling module is added to fuse multi-scale features and address the problem of missing small target objects in the prediction results. These methods have a positive effect on the segmentation accuracy of the model. In experiments, the proposed model reaches 79.5% mIoU on the Pascal VOC dataset, which verifies the effectiveness of the method.

(2) Although DPF-Net addresses the problems of small targets and object contours in image semantic segmentation to some extent, practical scenarios also impose strict real-time requirements on semantic segmentation networks. Therefore, an improved real-time semantic segmentation network based on BiSeNet is proposed. First, an improved lightweight ResNet-18 is used as the semantic feature extraction network to simplify the model architecture, and a spatial feature branch is added to preserve spatial detail in the extracted features. Second, the DFFM is used to fuse the features of the two branches and generate the prediction results. In experiments, the proposed model reaches 70.3% mIoU at 63 FPS on the Cityscapes dataset, which meets the speed and accuracy requirements of a real-time semantic segmentation model.

(3) To test the generalization of the models proposed in this thesis, a self-built campus dataset is used to verify the generalization of DPF-Net, and the CamVid dataset and a self-built Street View dataset are used to verify the generalization of the real-time semantic segmentation network based on the improved BiSeNet.
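The accuracy figures above are reported as mIoU (mean intersection over union). As a rough illustration of how this metric is commonly computed, the sketch below builds a per-pixel confusion matrix and averages the per-class IoU. The function names `confusion_matrix` and `mean_iou` are hypothetical, and skipping classes that appear in neither the labels nor the predictions is an assumed convention; the abstract does not describe the thesis's exact evaluation protocol.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average IoU = TP / (TP + FP + FN) over classes that occur at all."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # labelled c, predicted otherwise
        fp = sum(matrix[r][c] for r in range(n)) - tp   # predicted c, labelled otherwise
        denom = tp + fp + fn
        if denom == 0:
            continue  # assumed convention: ignore classes absent from both maps
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six flattened pixels, three classes (illustrative data only).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
score = mean_iou(confusion_matrix(labels, preds, num_classes=3))
print(f"mIoU = {score:.3f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. On a real dataset the confusion matrix would be accumulated over all pixels of all test images before the per-class IoU is averaged.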
Keywords/Search Tags: deep learning, semantic segmentation, attention mechanism, feature fusion, real-time semantic segmentation