| With the rise of deep learning, the field of object detection has developed rapidly and is now widely used in practice. However, shortcomings of object detection models, such as false detections, missed detections, and low detection accuracy, restrict further progress. This paper takes YOLOv3 as the baseline model, analyzes its drawbacks, and proposes corresponding improvements. The detailed research work is as follows: 1) YOLOv3 suffers from false and missed detections because detection features are often disturbed by other information as they propagate through the network, so an object detection model based on multi-scale feature fusion is proposed. First, the loss of image information caused by nearest-neighbor interpolation upsampling is analyzed, and an improved upsampling method is designed. Then it is shown that the feature fusion used in the feature pyramid attends mainly to neighboring feature layers, which loses information from earlier layers and introduces extra noise. A feature fusion scheme is proposed to address this information loss, and its effectiveness is verified experimentally. Both improvements are integrated into the YOLOv3 network; compared with the original YOLOv3 model, the improved model achieves higher detection accuracy, produces fewer false and missed detections, and shows a degree of generality. 2) To further improve the detection accuracy of the algorithm in Chapter 3, an object detection model based on context enhancement is proposed. First, the original model does not account for the significant differences between channels and spatial positions of the feature maps and treats all of them equally when predicting objects. To address this, a collaborative attention module is added to the network to apply attention to the feature maps, strengthening key regions and suppressing redundant information. Then, to enrich the information in the feature maps, a receptive field module is embedded in the network. It uses dilated (atrous) convolutions with different dilation rates to fuse semantic information from different receptive fields, giving the feature maps stronger representational power. The improved model is compared with other detection models on the Pascal VOC dataset. Experimental results show that the proposed model achieves 85.6% mAP on the VOC2007 test set, improving detection performance and further reducing the false detection rate. In addition, experiments on three datasets, SKU-110K, CCTSDB, and UAV, further verify the strong generalization performance of the proposed model. Finally, the improved object detection algorithm is modularized and applied to a practical object detection system. |
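The receptive field module described above relies on dilated convolutions with different dilation rates. The following is a minimal one-dimensional sketch of that idea in plain Python, not the network used in this work. The dilation rates (1, 2, 3), the all-ones kernel, the function names, and fusion by summation are all assumptions chosen for illustration; the abstract does not specify them. A kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions, so each branch sees a different amount of context.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated cross-correlation (no padding, stride 1)."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def fuse_branches(signal, kernel, dilations):
    """Sum branch outputs on the region covered by every branch.

    Branches are center-aligned, which assumes an odd kernel size.
    Summation is an illustrative choice of fusion, not the paper's.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    shortest = min(len(b) for b in branches)
    fused = []
    for j in range(shortest):
        total = 0
        for b in branches:
            offset = (len(b) - shortest) // 2  # crop wider outputs symmetrically
            total += b[offset + j]
        fused.append(total)
    return fused


if __name__ == "__main__":
    for d in (1, 2, 3):  # hypothetical dilation rates
        print(d, effective_kernel_size(3, d))
    print(fuse_branches(list(range(12)), [1, 1, 1], (1, 2, 3)))
```

With a kernel of size 3, the three branches cover 3, 5, and 7 input positions respectively, so the fused output at each position combines context at three scales without adding kernel weights.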