Thangka, a shining pearl in the snowy temple of Tibetan art, provides extremely important material for the study of Tibetan culture. Over the past few decades, the digitization of Thangka images and the associated vision tasks have greatly facilitated the conservation and dissemination of Thangka. Among these tasks, the retrieval of portrait Thangka images has received much attention because of its potential value in Thangka appreciation, comparison, and identification. Specifically, given a query image and an indexed library of Thangka images, the task is to generate a candidate list in which the library images are ranked in descending order of their similarity to the query image. Thangka retrieval faces several challenges. First, there is currently no portrait Thangka image dataset of sufficient scale with accurate annotations and fine-grained categories, which prevents deep learning from reaching its full potential. Second, related research has not optimized feature extraction models for the characteristics of Thangka images. Third, the objective of existing Thangka retrieval tasks is narrow: the similarity between images is measured only by whether the identity of the depicted figure matches, without considering the painting school. To address these problems, this dissertation explores retrieval methods based on different attribute features for portrait Thangka image retrieval and makes the following four contributions: (1) This dissertation collects and constructs a portrait Thangka image database of over 8000 images and annotates two attributes of each image: the identity of the figure and the school of painting. In addition, to reduce the impact of complex image backgrounds on retrieval accuracy, the figure is cropped using an object detection model, and the robustness of the model to image style is improved through
multiple data augmentation methods. (2) This dissertation proposes a Multi-Granular Feature Learning network that retrieves portrait Thangka images with priority given to the figure's identity. Exploiting the regular composition of portrait Thangka, local attention is guided by splitting the feature maps. To capture key information at different granularities, a perceptual feature pyramid is used to make full use of the shallow feature maps. To address the problem that features are isolated and cannot interact, step-wise graph convolution is adopted to pass messages between features, and the feature vectors from multiple granularities and multiple regions are fused. Compared with other models, the network shows significant advantages on the portrait Thangka retrieval task. (3) This dissertation proposes a Style Feature Representation Network that retrieves portrait Thangka images with priority given to the painting school. The images with school labels are augmented through neural style transfer. In addition, the role of the Gram matrix in representing style features is explored in depth. The content features extracted by the backbone are converted into style features, and the retrieval results are further refined with agglomerative clustering. The network fills the gap in the Thangka school retrieval task, and its results surpass those of current advanced models. (4) This dissertation proposes a multi-attribute Thangka retrieval network. Within a single model, portrait Thangka images are retrieved under both attribute priorities simultaneously, which improves data utilization and retrieval performance while maintaining the accuracy of both retrieval paths.
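
The retrieval task defined above (ranking the index library in descending order of similarity to the query) can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a feature vector and that similarity is measured by cosine similarity; the abstract does not state which similarity measure the dissertation uses, and the function name `rank_by_similarity` is hypothetical.

```python
import numpy as np

def rank_by_similarity(query, index_features):
    """Rank index-library entries by descending cosine similarity to the query.

    query:          1-D feature vector of the query image, shape (D,)
    index_features: 2-D array of library feature vectors, shape (N, D)
    Returns (order, sims): library positions from most to least similar,
    and the similarity scores in that same order.
    Cosine similarity is an assumption made for this sketch.
    """
    # L2-normalize so that the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    lib = index_features / np.linalg.norm(index_features, axis=1, keepdims=True)
    sims = lib @ q
    # Negate to sort in descending order; a stable sort keeps ties in library order.
    order = np.argsort(-sims, kind="stable")
    return order, sims[order]

# Toy example with three library images and 2-D features.
query = np.array([1.0, 0.0])
library = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
order, scores = rank_by_similarity(query, library)
```

In this toy case the second library entry is identical to the query and is ranked first, followed by the third entry and then the first.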
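
The Gram matrix mentioned in contribution (3) can likewise be sketched in a few lines. The example below follows the common definition from neural style transfer, in which channel-wise correlations of a feature map serve as a style descriptor; the normalization by the number of spatial positions and the helper names `gram_matrix` and `style_vector` are assumptions for illustration, not the dissertation's actual conversion from content features to style features.

```python
import numpy as np

def gram_matrix(feature_map):
    """Compute the Gram matrix of a feature map of shape (C, H, W).

    Entry (i, j) is the inner product of channels i and j over all spatial
    positions, divided by H * W (a normalization assumed for this sketch).
    """
    c, h, w = feature_map.shape
    flat = feature_map.reshape(c, h * w)
    return flat @ flat.T / (h * w)

def style_vector(feature_map):
    """Flatten the upper triangle of the symmetric Gram matrix into a vector."""
    g = gram_matrix(feature_map)
    rows, cols = np.triu_indices(g.shape[0])
    return g[rows, cols]

# Toy feature map with 2 channels and a 2 x 2 spatial grid.
fm = np.arange(8, dtype=float).reshape(2, 2, 2)
g = gram_matrix(fm)
v = style_vector(fm)
```

Because the Gram matrix sums over spatial positions, it discards where a pattern occurs and keeps only which channels respond together, which is why it is commonly used to describe style rather than content.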