| As one of the core tasks of machine vision, a branch of artificial intelligence, object detection has a profound impact on downstream tasks such as object tracking, pedestrian re-identification, and object segmentation. Depending on the detection target, the field divides into two main directions: general object detection and scene-specific object detection. The former detects and recognizes multiple objects of multiple categories in natural environments; the latter detects specific objects in specific scenarios. This paper studies general object detection algorithms based on deep learning. General object detection is characterized by complex backgrounds, large scale variation among the objects to be detected, and complex object contours. From the perspectives of feature extraction and feature processing, this paper proposes a solution to two problems: effective features are difficult to extract for multiple types of objects in complex backgrounds, and background noise is easily introduced. For feature processing, this paper proposes self-guided dual attention. First, spatial element attention over the feature map is realized by constructing a low-dimensional spatial embedding that captures local spatial context. Next, channel attention is constructed for the same feature map under different receptive fields. Finally, the self-guided spatial attention and the multi-receptive-field channel attention are combined to reduce noise interference from complex backgrounds and to highlight the effective information in the features. For feature extraction, an improved deformable convolution is used to achieve adaptive convolution sampling for objects with large scale variation and complex contours. In addition, because the classification task and the localization task are misaligned, current object detection algorithms are prone to missed, false, and repeated detections and are sensitive to scale variation. To address this, a regression feedback mechanism is proposed that improves existing algorithms from the perspective of network construction. Specifically, introducing spatial information from different depths of the regression subnet into the classification subnet gives the classifier the ability to perceive the spatial information of objects. A feedback loop then feeds the classification results and the localization regression results back into classifier training, which enables the classifier to learn the potential coupling between bounding boxes and object classes. Because the regression feedback mechanism aligns information between features, it lacks information integration within features, so this paper also proposes dimension-related attention. By establishing a correlation matrix of channel-dimension elements and spatial-dimension elements, it forms attention over the correlation of information between dimensions, enhancing highly correlated information and suppressing weakly correlated information. The anchor-free FCOS (Fully Convolutional One-Stage Object Detection) algorithm is used as the framework to verify the effectiveness of the method on the large-scale MS COCO dataset. The experimental results show that the proposed method effectively improves detection accuracy and that the improved algorithm outperforms current mainstream algorithms. |
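
The abstract does not give a formula for dimension-related attention, so the following is only a minimal sketch of one plausible reading, restricted to the channel dimension. It assumes the correlation matrix is a channel-to-channel inner product, normalized with a row-wise softmax and applied with a residual connection. The function name `channel_correlation_attention`, the `scale` parameter, and the softmax normalization are assumptions made for illustration and are not taken from the paper. Plain Python lists are used for readability; a real detector would use a tensor library.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_correlation_attention(features, scale=1.0):
    """Hypothetical channel-correlation attention (illustrative only).

    features: list of C channels, each a flat list of N spatial values.
    Returns (output, weights), where weights is the C x C attention matrix.
    """
    # Assumed correlation matrix: inner product between every pair of channels.
    corr = [
        [sum(a * b for a, b in zip(fi, fj)) for fj in features]
        for fi in features
    ]
    # Row-wise softmax: highly correlated channels receive larger weights,
    # weakly correlated channels are suppressed.
    weights = [softmax(row) for row in corr]

    n = len(features[0])
    # Each output channel is a weighted mix of all input channels.
    attended = [
        [sum(w * features[j][p] for j, w in enumerate(row)) for p in range(n)]
        for row in weights
    ]
    # Residual connection keeps the original feature and adds the attended one.
    output = [
        [x + scale * a for x, a in zip(f, att)]
        for f, att in zip(features, attended)
    ]
    return output, weights


# Two channels over two spatial positions.
out, w = channel_correlation_attention([[1.0, 0.0], [0.0, 1.0]])
```

The output keeps the input shape, and each row of `w` sums to one. A spatial counterpart could be sketched the same way by correlating spatial positions instead of channels, which is again an assumption rather than the paper's stated design.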