Instance Segmentation Based On Multi-Scale Feature Fusion And Syncretic Non-Maximum Suppression

Posted on:2021-12-14

Degree:Master

Type:Thesis

Country:China

Candidate:Y Q Zhang

Full Text:PDF

GTID:2518306119470794

Subject:Software engineering

Abstract/Summary:

PDF Full Text Request

Instance segmentation is an emerging comprehensive task in computer vision.This comprehensive task involves three subtasks: image classification,object detection,and semantic segmentation.Instance segmentation is a basic task for autonomous driving and 3D reconstruction.It has a good application prospect in the industry and has attracted the attention of many scholars and industrial persons.In recent years,with the development of deep learning,instance segmentation has been developed from scratch and gradually developed into two-step methods and one-step methods.The current instance segmentation algorithms generally have problems such as poor mask quality and inability to include complete objects in the bounding boxes.These problems are the key issues in pushing instance segmentation to applications.To solve the above two key problems,this paper combines the classic two-step instance segmentation algorithm,and starts from the aspects of multi-scale feature fusion and optimized nonmaximum suppression algorithms.The main work of this article is as follows:(1)An instance segmentation algorithm combining multi-scale feature fusion(MaskRefined R-CNN)is proposed to solve the problem of incorrect classification of object details and inaccurate segmentation mask.The scale-invariant fully convolutional network structure of traditional two-step instance segmentation networks ignore the difference in spatial information between receptive fields of different sizes.The network cannot consider the relationship between the pixels at the object edge,and these pixels will be misclassified.In this paper,based on the instance segmentation framework Mask R-CNN,the original fully convolutional semantic segmentation head network is replaced with a feature pyramid structure that can fuse high-level and lowlevel information,and the output size of ROIAlign is adjusted accordingly.Lateral connections are added to the pyramid to balance the amount of information passed forward and backward,and finally return images with the same resolution as the input images.The method can effectively improve the fineness of the segmentation mask,especially for larger objects in images,the mask quality will be significantly improved.Mask-Refined R-CNN can effectively improve the segmentation accuracy,and it performs well when predicting large-scale instances.(2)Syncretic non-maximum suppression algorithm for instance segmentation is proposed to solve the problem that the bounding box returned by object detection cannot completely contain instances.Semantic segmentation is conducted on the bounding boxes that are returned by detectors.The evaluation criteria for object detection require that the bounding box be as close as possible to the ground truth,but they do not emphasize the integrity of the included object.Therefore,the bounding boxes typically cannot completely contain objects,and the parts that are outside the bounding boxes cannot be correctly predicted in the subsequent semantic segmentation.To solve this problem,we propose the Syncretic-NMS algorithm.The algorithm obtains the bounding boxes that are returned by traditional NMS,judges the neighboring bounding boxes of each bounding box,and combines the neighboring boxes that are strongly correlated with the corresponding bounding boxes.The coordinates of the merged box are the four coordinate extremes of the bounding box and the highly relevant neighboring box.By judging the degree of correlation between the neighboring box and the corresponding bounding box,the neighboring box with strong correlation is merged with the corresponding bounding box.Based on an analysis of the influences of corresponding factors,the criteria for correlation judgment are specified.The computational complexity of Syncretic-NMS is the same as that of traditional NMS.Syncretic-NMS is easy to implement,requires no additional training,and can be easily integrated into the available instance segmentation framework.(3)The two algorithms proposed in this paper are evaluated on the MS COCO dataset and the Cityscapes dataset,and compared with the current mainstream instance segmentation algorithms.Experiments show that Mask-Refined R-CNN can effectively improve the precision of the segmentation mask,especially for larger objects in the image,the mask quality will be significantly improved.Syncretic-NMS can steadily increase the accuracy of instance segmentation,and the algorithm can adapt to application scenario changes.(4)The two algorithms in this paper are evaluated under an instance segmentation framework,and compared with the state-of-the-art instance segmentation algorithms.Experiments show that instance segmentation with multi-scale feature fusion and syncretic non-maximum suppression achieves the highest segmentation accuracy,which can not only include instances more completely,but also accurately segment instances in the bounding box.

Keywords/Search Tags:

Instance Segmentation, Feature Fusion, Non-Maximum Suppression, Object Localization, Deep Learning

PDF Full Text Request

Related items

1	Research On Object Detection And Instance Segmentation Based On Geometric Factors
2	Research And Application Of Real-Time Instance Segmentation Network Based On Deep Learning
3	Video Object Detection Based On Deep Learning
4	Research And Implementation Of People Detection System Based On Deep Learning
5	Research Of Single-Stage Object Detection Algorithms Based On Deep Learning
6	Research On Instance Segmentation Algorithm Based On Mask R-CNN
7	A Study On Multi-scale Feature Extraction And Optimized Non-maximum Suppression For Object Detection
8	Research On Object Detection And Instance Segmentation Algorithms Based On Deep Learning
9	Research On High-Performance Instance Segmentation Based On Deep Learning
10	Research On The 3D Point Cloud Segmentation Method Based On Deep Learning