| In computer vision, object detection is one of the most active research directions, with important applications in medical image processing, autonomous driving, and crowd-flow monitoring during epidemics. In recent years, with the continuous development of computer hardware and the emergence of deep learning, traditional object detection algorithms are gradually being replaced. YOLOv5 is a deep learning-based object detection algorithm that offers faster detection and better real-time performance than other object detection algorithms, but it also has shortcomings: its detection accuracy is low for occluded targets and small targets. Therefore, taking the YOLOv5s model as the baseline, three improvement schemes are proposed. 1) The original YOLOv5 backbone network is reconstructed based on the HS-ResNet residual module, and the weighted feature fusion strategy of BiFPN is introduced into the neck network to improve the original feature fusion method and enrich the semantic information of the resulting features. In the detection layer, the number of detection heads is extended from 3 to 5. Evaluated on the MS COCO and PASCAL VOC datasets, the improved network model increases AP by 4% and 3.3%, respectively. 2) The HS-ResNet residual module is structurally optimized, and the original YOLOv5 backbone network is reconstructed based on the optimized module. A spatial pyramid mechanism is introduced into the Gaussian context transformer, which is added to the neck network together with the multi-path vision transformer to enhance the features extracted by the backbone network. In the detection layer, the number of detection heads is extended from 3 to 4. Evaluated on the MS COCO and PASCAL VOC datasets, the improved network model increases AP by 4.7% and 2.4%, respectively. 3) The optimized HS-ResNet residual module is fused with multi-axis self-attention to reconstruct the YOLOv5 backbone network. The neck network is improved by introducing multi-axis self-attention into the Gaussian context transformer with the spatial pyramid mechanism. In the detection layer, the existing coupled head is replaced by a streamlined decoupled head. Evaluated on the MS COCO and PASCAL VOC datasets, the improved network model increases AP by 2.2% and 2.4%, respectively. |
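
The abstract names BiFPN's weighted feature fusion but does not state the formula. As a rough illustration only, the sketch below assumes the "fast normalized fusion" rule commonly associated with BiFPN, in which each input feature gets a learnable non-negative weight and the output is the weighted sum divided by the sum of the weights plus a small epsilon. The function name `fast_normalized_fusion`, the flat-list representation of feature maps, and the default `eps` are assumptions made for this example and are not taken from the work summarized above; the fusion variant actually used there may differ.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-sized feature vectors with normalized non-negative weights.

    Hypothetical helper: each entry of `features` stands in for a feature map
    that has already been resized to a common shape and flattened.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature")
    size = len(features[0])
    if any(len(f) != size for f in features):
        raise ValueError("all input features must have the same size")

    # ReLU keeps every weight non-negative, so the normalization is stable.
    w = [max(0.0, x) for x in weights]
    total = sum(w) + eps  # eps avoids division by zero when all weights are 0

    # Weighted sum of the inputs at each position, divided by the weight total.
    return [
        sum(wi * f[k] for wi, f in zip(w, features)) / total
        for k in range(size)
    ]


if __name__ == "__main__":
    shallow = [1.0, 2.0]  # e.g. a high-resolution feature, flattened
    deep = [3.0, 4.0]     # e.g. an upsampled low-resolution feature
    print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

With equal weights the result is close to the element-wise mean of the inputs; in a trained network the weights would be learned, letting the neck emphasize whichever scale is more informative.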