
Multi-level Self-supervised Image Representation Learning Based On Attention Feature Fusion

Posted on: 2024-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: F Chen
Full Text: PDF
GTID: 2568307094981609
Subject: Computer Science and Technology
Abstract/Summary:
In computer vision, general, high-quality image representations are essential for improving the performance of deep neural network models and for solving downstream tasks such as image classification and object detection. Moreover, the limitations of supervised learning, which requires large amounts of accurate image annotations, have made self-supervised image representation learning a research topic of great interest. However, most self-supervised image representation learning methods focus mainly on high-level features and ignore the contribution of low-level features, which carry more detailed information and transfer better, to the learned representations. In addition, under unsupervised conditions, although a general attention mechanism lets model training focus on local regions of the features, it also interferes with the learning of global features and reduces the generalization ability of the image representations. Therefore, to effectively exploit the abstraction properties of multi-level features in a neural network and to improve the quality and generalization of image representations, this thesis studies multi-level self-supervised image representation learning based on attention feature fusion. The main contributions are as follows:

(1) To improve image representation quality and generalization ability, a multi-level self-supervised image representation learning method based on three-way attention fusion and local similarity optimization is proposed for features at different levels of the model. First, the channel attention weights of features from different layers are extracted by a three-way attention feature fusion module, and the weighted features are fused to increase the information content and transferability of the features (a sketch of such a fusion module follows this abstract). In addition, considering the effect of redundant and irrelevant information on feature quality, a local similarity optimization strategy based on mutual information is designed to optimize the fused features and further improve representation quality (see the loss sketch below). Finally, experimental results on four public datasets, CIFAR10, CIFAR100, STL10, and Tiny ImageNet, demonstrate the effectiveness of the proposed method for self-supervised image representation learning.

(2) To further improve the generality of the image representations, a restricted attention feature fusion network for self-supervised image representation learning is proposed, addressing the problem that the feature fusion module in (1) produces overly large attention weights and hinders the self-supervised model from learning global features. First, a new feature fusion strategy with dual channel and spatial attention mechanisms is adopted to fuse multi-level features effectively. Second, a simple but effective attention weight matrix is designed to limit excessively high spatial attention weights and prevent training from focusing only on local features with high attention (see the restricted-attention sketch below). Finally, experimental results on four public classification datasets (CIFAR10, CIFAR100, Tiny ImageNet, and ImageNet-1%), two object detection datasets (PASCAL VOC and COCO), and an ancient-architecture dataset show that the proposed method achieves better representation performance and generalization ability.

(3) Building on (1) and (2), the proposed models and algorithms are consolidated and packaged, and an image representation learning system supporting multi-level self-supervised representation model training, fine-tuning, and visualization is designed and implemented on the PyQt5 and PyTorch frameworks.
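The thesis does not reproduce its implementation here, so the following is a minimal PyTorch sketch of a three-way channel-attention fusion module as described in (1): SE-style channel attention reweights each level's features before they are projected and summed. The class names (ChannelAttention, ThreeWayAttentionFusion), the reduction ratio, and the sum-fusion choice are assumptions, not the thesis code.

```python
# Hypothetical sketch of three-way channel-attention feature fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention weights (assumed form)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))      # global average pool -> (B, C)
        return x * w.view(b, c, 1, 1)        # reweight channels

class ThreeWayAttentionFusion(nn.Module):
    """Fuse low/mid/high-level feature maps after channel reweighting."""
    def __init__(self, in_channels: list, out_channels: int):
        super().__init__()
        self.attn = nn.ModuleList([ChannelAttention(c) for c in in_channels])
        # 1x1 convs project each level to a common channel width
        self.proj = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])

    def forward(self, feats: list) -> torch.Tensor:
        size = feats[-1].shape[-2:]          # align to the deepest (smallest) map
        fused = 0
        for f, attn, proj in zip(feats, self.attn, self.proj):
            f = proj(attn(f))
            if f.shape[-2:] != size:
                f = F.adaptive_avg_pool2d(f, size)
            fused = fused + f
        return fused
```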
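The local similarity optimization in (1) is stated as mutual-information based; a common way to realize such an objective is an InfoNCE-style lower bound between spatially aligned local features of two augmented views, which is what this sketch assumes. The positive-pairing scheme and the temperature are illustrative choices, not taken from the thesis.

```python
# Hypothetical InfoNCE-style local similarity loss over fused feature maps.
import torch
import torch.nn.functional as F

def local_mi_loss(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.2) -> torch.Tensor:
    """z1, z2: (B, C, H, W) fused features of two augmented views of the same images."""
    b, c, h, w = z1.shape
    p1 = F.normalize(z1.flatten(2).transpose(1, 2), dim=-1)   # (B, HW, C)
    p2 = F.normalize(z2.flatten(2).transpose(1, 2), dim=-1)   # (B, HW, C)
    logits = torch.bmm(p1, p2.transpose(1, 2)) / tau          # (B, HW, HW)
    # Positives are spatially aligned locations; every other location in
    # the same image (redundant/irrelevant patches) acts as a negative.
    target = torch.arange(h * w, device=z1.device).expand(b, -1)
    return F.cross_entropy(logits.reshape(-1, h * w), target.reshape(-1))
```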
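Finally, the restricted attention in (2) caps spatial attention weights so that no single region dominates training. The sketch below assumes a CBAM-style spatial attention map followed by a simple clamp-and-reweight step; the cap value and the class name are hypothetical stand-ins for the thesis's attention weight matrix.

```python
# Hypothetical capped spatial attention, limiting peak attention weights.
import torch
import torch.nn as nn

class RestrictedSpatialAttention(nn.Module):
    """Spatial attention whose per-location weights are capped so training
    does not collapse onto a few high-attention local regions."""
    def __init__(self, cap: float = 0.6):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)
        self.cap = cap

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Spatial descriptor from channel-wise average and max pooling
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)   # (B, 2, H, W)
        a = torch.sigmoid(self.conv(s))                        # (B, 1, H, W)
        a = a.clamp(max=self.cap)                              # limit peak weights
        return x * a
```

Clamping after the sigmoid is one plausible reading of "limiting higher spatial attention weights"; normalizing the map so its maximum equals the cap would be an equally reasonable alternative.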
Keywords/Search Tags: Image representation learning, Self-supervised learning, Feature fusion, Attention mechanism, Mutual information, Attention weight matrix