
Research On Clothing Parsing Based On Multi-scale Fusion And Positional Attention Networks

Posted on: 2022-11-07    Degree: Master    Type: Thesis
Country: China    Candidate: Y Hu    Full Text: PDF
GTID: 2481306779964219    Subject: Computer Software and Computer Applications
Abstract/Summary:
The rapid development of e-commerce has driven research centered on clothing. Clothing parsing is an important branch of this field: it assigns a predefined semantic label to each pixel in an image, dividing the image into multiple semantically consistent regions. Because these regions provide high-level semantic information such as background, clothing category, position, and shape, clothing parsing has become a key technology for implementing and improving a variety of clothing applications. Owing to the diversity and variability of clothing and scenes, clothing parsing is complex and time-consuming, and given the sheer number of clothing images, traditional manual annotation can no longer meet the automation requirements of current clothing applications. How to parse clothing images automatically has therefore become an important problem in image segmentation research. To address the lack of positional and global priors in existing clothing parsing algorithms, this thesis proposes a clothing parsing algorithm based on contrastive learning, positional attention, and a global prior. The main contributions are as follows:

(1) The idea of contrastive learning is carried over from the unsupervised domain to the fully supervised clothing parsing task, yielding a contrastive-learning-based parsing algorithm that addresses the tendency of existing algorithms to ignore the latent global information in the dataset. By constructing positive and negative sample sets and jointly optimizing a contrastive loss with the cross-entropy loss, pixels of the same class are pulled together in the embedding space and pixels of different classes are pushed apart, improving intra-class compactness and inter-class separability.

(2) An enhanced positional attention module (EPAM) is proposed to exploit the fact that the class distribution of clothing depends strongly on vertical position. The module extracts contextual information along the vertical direction of the image and uses it to compute attention weights that determine how channels are weighted during pixel-level classification, strengthening the spatial modeling capability of the network.

(3) A global prior module (GPM) is proposed to reduce the accuracy loss caused by the large scale differences between clothing items. The module captures image details at multiple scales and improves the model's ability to extract multi-scale features through contextual aggregation over regions of different sizes.

The above work is validated on standard clothing parsing datasets. The improved EG-ResNet parsing algorithm based on positional attention and the global prior achieves a mean Intersection over Union (mIoU) of 51.12% and a Pixel Accuracy (PA) of 92.79%. Combining the internal and global information of the images further improves the results to an mIoU of 51.46% and a PA of 92.83%. The experiments demonstrate that the proposed algorithm parses clothing items accurately, effectively addressing the unsatisfactory segmentation accuracy of existing algorithms that give little consideration to the intrinsic features of the image, and that it offers good applicability and practicality in real clothing applications.
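As an illustration of the joint optimization described in contribution (1), below is a minimal sketch of a supervised pixel-wise contrastive loss combined with cross-entropy. The function names, the temperature, and the weighting factor `lam` are illustrative assumptions, not the thesis's exact formulation.

```python
import torch
import torch.nn.functional as F

def pixel_contrastive_loss(emb, lbl, temperature=0.1):
    """InfoNCE-style supervised contrastive loss over sampled pixel embeddings.
    emb: (N, D) L2-normalised embeddings; lbl: (N,) class index per pixel."""
    sim = emb @ emb.t() / temperature                          # pairwise similarities
    self_mask = torch.eye(len(lbl), dtype=torch.bool, device=emb.device)
    pos_mask = lbl.unsqueeze(0).eq(lbl.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float('-inf'))            # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)            # avoid -inf * 0 below
    pos = pos_mask.float()
    loss = -(log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)  # mean over positives
    return loss[pos_mask.any(1)].mean()                        # anchors with >= 1 positive

def joint_loss(seg_logits, pixel_emb, pixel_lbl, target, lam=0.1):
    """Cross-entropy on the segmentation logits plus a weighted contrastive term."""
    ce = F.cross_entropy(seg_logits, target, ignore_index=255)
    con = pixel_contrastive_loss(F.normalize(pixel_emb, dim=1), pixel_lbl)
    return ce + lam * con
```

Pulling same-class pixel embeddings together and pushing different-class ones apart is what yields the intra-class compactness and inter-class separability described above; the cross-entropy term keeps the network anchored to the supervised labels.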
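A minimal sketch of the idea behind the enhanced positional attention module in contribution (2): features are pooled along the width, the vertical context is modeled with a 1-D convolution, and per-row channel weights rescale the feature map. The class name, reduction ratio, and layer choices are assumptions; the thesis's EPAM may differ in detail.

```python
import torch.nn as nn

class VerticalPositionAttention(nn.Module):
    """Row-wise channel attention: pool over the width, model vertical context
    with 1-D convolutions, and rescale each channel per image row."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Conv1d(channels, channels // reduction, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels // reduction, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x):                 # x: (B, C, H, W)
        row_ctx = x.mean(dim=3)           # (B, C, H): average over the width
        weights = self.fc(row_ctx)        # (B, C, H): per-row channel weights
        return x * weights.unsqueeze(3)   # broadcast the weights over the width
```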
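Contribution (3) describes contextual aggregation over regions of different sizes; a common realization of such a global prior is pyramid-pooling-style aggregation, sketched below under that assumption. The bin sizes and projection layers are illustrative, not taken from the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalPriorModule(nn.Module):
    """Pool the feature map to several grid sizes, project, upsample, and
    concatenate back, so the head sees local detail and image-level context."""
    def __init__(self, in_ch, bins=(1, 2, 3, 6)):
        super().__init__()
        out_ch = in_ch // len(bins)
        self.stages = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(b),
                          nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
                          nn.BatchNorm2d(out_ch),
                          nn.ReLU(inplace=True))
            for b in bins
        ])

    def forward(self, x):                          # x: (B, C, H, W)
        h, w = x.shape[2:]
        priors = [F.interpolate(stage(x), size=(h, w), mode='bilinear',
                                align_corners=False) for stage in self.stages]
        return torch.cat([x] + priors, dim=1)      # (B, 2C, H, W) when C % len(bins) == 0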
Keywords/Search Tags: clothing parsing, convolutional neural network, contrastive learning, attention mechanism, positional prior