| Thangka are a precious intangible cultural heritage with a history of thousands of years, handed down in the form of embroidery, painting, or carving. Applying object detection methods to Thangka can provide technical support for subsequent Thangka research and give newcomers a basic understanding of the art form, which helps its transmission and dissemination. Commonly used object detection methods require manually annotated datasets for training; however, existing Thangka datasets are small, and commonly used object detection algorithms perform poorly on them. Therefore, this paper investigates how few-shot object detection methods can be used to detect key objects in Thangka images, based on in-depth mining of the visual and semantic features of those images. The main research work and innovations of this paper are as follows: (1) A few-shot object detection model for Thangka based on multi-scale context information and a dual attention module (FSMC) is established. First, a new multi-scale feature pyramid is constructed using dilated (atrous) convolution in the feature extraction phase, which extracts rich image context information. Second, a dual attention guidance module is introduced at the end of the feature pyramid to model the dependencies between features in both the spatial and channel dimensions, enhancing the representation of important features and further reducing the impact of redundant information on the detection results. Finally, a supervised classifier trained with RSLoss instead of the cross-entropy loss is used to reduce misclassification of similar objects and improve the classification performance of the network. The mAP reaches 19.7% and 11.5% in the 6-way 10-shot setting on the Thangka dataset and the COCO dataset, respectively, and 24.4% and 14.9% in the 6-way 20-shot setting, respectively. (2) An improved FSCE model based on G-bneck (G-FSCE) is established. This part of the work addresses the large computational cost of the FSMC model by redesigning it along the lines of lightweight model design. First, a new feature extraction network is designed using the G-bneck module to replace the contextual feature extraction module in FSMC. A large number of features are obtained from the expanded feature map computation in the G-bneck module and then fused with the original input features, retaining as much important contextual information as possible while simplifying the model. Then, a lightweight ECA attention module is employed to enhance feature representation in the channel dimension and filter redundant features, further improving detection accuracy. Finally, experiments show that the model complexity of the proposed method is much lower than that of FSMC and that it achieves an AP50 of 41.7% in the 6-way 20-shot setting, outperforming the FSCE method. |
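
The feature pyramid in item (1) is built from convolution with holes between kernel taps, usually called dilated or atrous convolution. The sketch below illustrates only that general idea in one dimension and in pure Python: the same small kernel covers a wider context as the dilation rate grows, without adding weights. It is not the FSMC implementation; the function names `dilated_conv1d`, `receptive_field`, and `multiscale_context`, the all-ones kernel, and the dilation rates (1, 2, 4) are assumptions chosen for illustration.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation with (dilation - 1) holes between taps."""
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    span = receptive_field(len(kernel), dilation)
    out = []
    # Slide the dilated kernel over every position where it fits entirely.
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, weight in enumerate(kernel):
            acc += weight * signal[start + k * dilation]
        out.append(acc)
    return out


def multiscale_context(signal, kernel, dilations=(1, 2, 4)):
    """Apply one kernel at several dilation rates (rates are hypothetical)."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    kernel = [1, 1, 1]  # illustrative weights, not learned
    for rate, response in multiscale_context(signal, kernel).items():
        print(rate, receptive_field(len(kernel), rate), response)
```

With a 3-tap kernel, dilation rates of 1, 2, and 4 span 3, 5, and 9 input positions respectively, which is why stacking or combining such branches yields context at several scales for the cost of a small kernel. In a real detector the same idea is applied in 2D over feature maps with learned weights.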