Research On Key Techniques Of Visual Semantic Segmentation Of Street Scenes

Posted on: 2021-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: H W Hu
Full Text: PDF
GTID: 2492306503971859
Subject: Control Engineering
Abstract/Summary:
Semantic segmentation is a fundamental task in computer vision that parses the content of a scene. This thesis studies key techniques for visual semantic segmentation of street scenes in two parts: using scene depth information to improve the accuracy of semantic segmentation models, and accelerating those models.

Existing street scene semantic segmentation approaches mainly use the color information of the scene to perform pixel-wise classification, which is prone to misclassification caused by intra-class inconsistency and inter-class similarity. This thesis proposes exploiting depth information to alleviate these misclassifications, and incorporates depth into the semantic segmentation framework either as prior information or as supervisory information. When depth is used as prior information, convolutional neural networks extract features from the RGB image and the depth image, and these features are then fused in four ways to perform semantic segmentation. When depth is used as supervisory information, it serves as a learning target alongside the ground-truth labels of the samples, so that the model performs semantic segmentation and depth regression simultaneously and extracts more robust features. Experiments show that street scene semantic segmentation models that use depth information achieve higher accuracy.

An important limitation of semantic segmentation is its high demand for computing power, which makes many models difficult to deploy in practical environments where computing resources are limited. To reduce model complexity and the consumption of computing resources, this thesis accelerates semantic segmentation models through three methods: lightweight model design, knowledge distillation, and model quantization. The lightweight model extracts multi-scale and global features, and includes spatial attention modules and a concise decoder. It balances model complexity against accuracy and performs real-time semantic segmentation. Knowledge distillation transfers the knowledge learned by complex models to lightweight semantic segmentation models. This thesis proposes a distillation loss function that models the global relationships between pixels, which further improves the accuracy of the lightweight models. Model quantization converts the numerical format from floating point to integer, which greatly reduces the storage space of models and improves inference speed at the cost of only a small loss in accuracy. Experimental results demonstrate the effectiveness of the three methods.
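The float-to-integer conversion described above can be illustrated with a minimal sketch. The abstract does not state which quantization scheme the thesis uses, so the symmetric per-tensor int8 scheme below is an assumption, and the names `quantize_int8`, `dequantize`, and the sample weights are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: symmetric per-tensor int8 quantization.
# The scheme, function names, and sample weights are assumptions,
# not the method described in the thesis.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in q]


weights = [0.82, -1.27, 0.003, 0.5, -0.41]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Under this assumed scheme each weight is stored in one byte instead of the four bytes of a 32-bit float, which is where the storage saving comes from, and the per-weight rounding error stays within half of `scale`, which is consistent with the small accuracy loss the abstract reports.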
Keywords/Search Tags:Deep Learning, Semantic Segmentation, Scene Parsing, Knowledge Distillation