| Object detection is one of the most fundamental and important tasks in computer vision: it localizes objects in images and automatically identifies their classes, and it is of great significance for both civil and military applications. Existing object detection methods mostly depend on the distribution and quantity of sample data and require sufficient labeled data to achieve good detection performance. However, labeling is costly, and samples of many object types are scarce, which makes large labeled datasets difficult to obtain. It is therefore extremely important to learn models with a degree of generalization ability from very few labeled samples. At the same time, detection performance varies greatly with object size, because small objects offer fewer usable features and are highly susceptible to environmental interference. Object occlusion and densely packed objects further degrade the performance of existing algorithms on small objects. In response to the low accuracy, poor robustness, and poor real-time performance of existing algorithms on few-shot and small object detection, this paper uses deep learning to make the following improvements based on PP-ShiTu: (1) To address the low accuracy of existing algorithms in few-shot object detection, this paper proposes a new backbone network, DLCNet, built on the foreground detection part of the baseline. Learnable fusion factors assign different connection weights to different convolutional blocks, so that the network learns both coarse-grained and fine-grained image information, and these weights are optimized iteratively to maximize network performance. This paper also proposes a new attention mechanism that forms a new network structure, TCSP, which integrates related features across the spatial and channel dimensions, fuses local and global features, and enriches the expressive power of the final feature map. In addition, various data augmentation methods are introduced; by combining fine-tuning, distance metric learning, and data augmentation, an integrated few-shot object detection algorithm is established that greatly improves the accuracy of few-shot object detection. (2) To address the low accuracy of existing algorithms in small object detection, this paper introduces the Vision Transformer into the foreground detection part and constructs a new network structure, Concat Net, to capture multi-scale information, effectively capture local features and global contextual information of objects, and better model objects of different sizes, especially small ones. In addition, in the feature extraction part, this paper proposes a new network, Twofold Net, which uses depthwise separable convolution to strengthen the originally simple baseline network structure, generate more feature maps with fewer parameters, improve the learning ability of the network, obtain dense latent representations, and improve the accuracy and robustness of small object detection. |
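
The claim that depthwise separable convolution yields feature maps with fewer parameters can be checked with simple arithmetic. The sketch below is illustrative only: the channel counts and kernel size are hypothetical values chosen for the example, not the configuration of Twofold Net, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)

print(f"standard conv:            {std} weights")
print(f"depthwise separable conv: {sep} weights")
print(f"ratio: {sep / std:.4f} (equals 1/c_out + 1/k^2)")
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is why the same parameter budget can support more feature maps.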