
Research On Multi-Object Multi-Part Parsing Method Integrating Semantic Perception And Contrastive Learning

Posted on: 2024-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: G Q Liu
Full Text: PDF
GTID: 2568307067958369
Subject: Computer technology
Abstract/Summary:
In recent years, the rapid development of deep learning has brought new opportunities to image processing, especially in object parsing, image segmentation, and image understanding. Object parsing is a complex, fine-grained image segmentation task in computer vision, and multi-object multi-part parsing is its more challenging variant. It requires segmenting not only the different objects that appear in a scene but also the semantic parts within each object, so that every object in the image is understood comprehensively. The task has both theoretical significance and practical value. Its application scenarios are broad: robotic arm operation in robotics, scene understanding in autonomous driving, environmental perception in smart homes, and target recognition in military security. Research on multi-object multi-part parsing provides a foundation for these applications and promotes the development of computer vision.

This thesis addresses shortcomings of existing multi-object multi-part parsing methods and proposes corresponding solutions. The main content and contributions are as follows:

(1) An attention mechanism is introduced into the boundary-aware algorithm to construct an auxiliary branch. The boundary-aware algorithm detects part boundaries in the feature extraction stage; the attention mechanism then enhances the extracted boundary features and integrates them into the main features, enabling the model to better resolve boundary ambiguity.

(2) An object semantic perception module is proposed. Because parts belonging to different semantic classes are unrelated, each part depends heavily on its corresponding semantic class. The module extracts the corresponding object-level semantic information and uses it to guide the subsequent part segmentation.

(3) A reconstruction loss function is introduced to model the relationship between the objects in an image and their component parts. The reconstruction loss regroups the parts into objects and applies an additional penalty to parts that do not belong to the current semantic class.

(4) A pixel-level hierarchical contrastive loss function is proposed for the fully supervised setting. The supervised contrastive loss compares positive and negative samples so that samples of the same category lie closer together in feature space and samples of different categories lie farther apart. In addition, object-level and part-level labels are used to construct hierarchical relationships, and the contrastive losses are applied at each level to enforce hierarchical constraints, producing a well-structured representation space.

Comparative and ablation experiments were conducted on the Pascal-Part-58 and Pascal-Part-108 datasets to compare the accuracy of the proposed model with mainstream multi-object multi-part parsing models. On Pascal-Part-58, the proposed method achieved an average precision of 59.9%, an improvement of 1.7% over the BSANet baseline model, and outperformed the compared models. On Pascal-Part-108, it achieved an average precision of 47.0%, outperforming the baseline and the compared models. The hierarchical contrastive loss produced a hierarchically structured representation space, which substantially reduced erroneous predictions for parts with similar appearance, both across different objects and within the same object. These results demonstrate the feasibility of the proposed multi-object multi-part parsing method.
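
To illustrate the idea behind contribution (4), the following is a minimal sketch of a supervised contrastive loss applied at two label levels. It is not the thesis's implementation: the function names (`sup_con_loss`, `hierarchical_con_loss`), the temperature, the level weights, and the choice to treat an (object, part) pair as the fine-grained label are all assumptions made for illustration, and the abstract does not specify how the levels are actually combined.

```python
import math


def _normalize(v):
    # L2-normalize a feature vector; fall back to 1.0 to avoid dividing by zero.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def sup_con_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors that have a positive.

    features: list of per-pixel embedding vectors (hypothetical sampling).
    labels:   one hashable label per embedding.
    """
    z = [_normalize(f) for f in features]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Scaled cosine similarity between the anchor and every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


def hierarchical_con_loss(features, object_labels, part_labels,
                          part_weight=1.0, object_weight=0.5, temperature=0.1):
    """Weighted sum of a part-level and an object-level contrastive term.

    Assumption: part-level positives must share both object and part class.
    """
    fine_labels = list(zip(object_labels, part_labels))
    part_term = sup_con_loss(features, fine_labels, temperature)
    object_term = sup_con_loss(features, object_labels, temperature)
    return part_weight * part_term + object_weight * object_term
```

In this sketch, embeddings that cluster by label yield a small loss, while embeddings whose neighbours carry different labels yield a large one; the object-level term additionally pulls together pixels of different parts that belong to the same object class.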
Keywords/Search Tags:Deep Learning, Object Parsing, Semantic Segmentation, Attention Mechanism, Multi-object Multi-part Parsing