
The Technical Research On Multi-scale Object Detection In YOLO Detection Models

Posted on: 2024-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: M Z Wang
Full Text: PDF
GTID: 2568307157451564
Subject: Software engineering
Abstract/Summary:
Object detection is an important branch of computer vision whose main task is to classify and locate multiple objects in an image. In recent years, with the development of deep learning, deep object detection networks have achieved great success. They are widely used in autonomous driving, intelligent transportation, industrial inspection, and other fields, and have significant research and application value. Among them, the one-stage YOLO series has become a mainstream deep detection method owing to its simple structure and fast detection speed. However, YOLO is prone to errors in multi-scale object detection because object size and position vary; dense small objects in particular often lead to missed and false detections. To address the multi-scale object problem, this thesis traces the main line of development of the YOLO detection algorithms, takes the structural characteristics of multi-layer convolutional networks as its starting point, proposes a series of improved algorithms, and verifies their effectiveness. The main work of this thesis is as follows.

1) This thesis investigates the background and significance of object detection and gives an overview of the current state of research in China and abroad. It surveys the existing types of algorithms and presents the theory behind several representative ones. The YOLOX object detection algorithm used in this thesis is then described in detail. Finally, commonly used evaluation metrics for object detection algorithms are introduced.

2) Feature pyramids, which improve detection performance by integrating feature information from different levels, have become a basic remedy for the weak multi-scale detection performance of YOLO series networks. However, the rigid concatenation used in previous methods often fails to reflect the synergy between different features. This thesis therefore proposes an adaptive feature matching technique whose main goal is to strengthen collaboration between the differing features by emphasizing or suppressing the spatial information of each. The method builds on the spatial attention mechanism and uses the softmax function to produce differential masks that allocate the contribution of each feature scale to the final prediction. The module is plug-and-play, and this thesis uses YOLOX as the baseline model to verify it. Detection results on different datasets show that the method significantly improves the model’s adaptability to targets of different scales.

3) To address the insufficient detection accuracy of YOLO series networks on objects of different scales, this thesis proposes improvements to the feature extraction network and the loss function. First, the contextual transformer is used to optimize the feature extraction network and enhance the model’s feature representation ability. Second, the existing IoU loss cannot optimize cases where the predicted box and the ground-truth box do not overlap, and it optimizes localization only from the perspective of overlap, so it cannot effectively address the multi-scale problem. This thesis therefore proposes the CSIoU loss, which builds on SIoU and adds a constraint term on the target’s aspect ratio to improve detection of targets at different scales. Experiments with the YOLOX model show that the proposed algorithm achieves better detection performance on both a self-made trademark dataset and the general-purpose VOC dataset.
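The limitation of the plain IoU loss noted in item 3) can be illustrated with a small sketch. It is not the thesis's CSIoU loss, whose formula the abstract does not give; it only assumes the common definition of the IoU loss as 1 - IoU on boxes in (x1, y1, x2, y2) form, and the function names and example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero
    # so that disjoint boxes yield an empty intersection.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Plain IoU loss, assumed here to be 1 - IoU."""
    return 1.0 - iou(pred, target)


# Hypothetical boxes: one ground truth and three predictions.
target = (0.0, 0.0, 10.0, 10.0)
partial = (5.0, 0.0, 15.0, 10.0)       # overlaps half of the target
near_miss = (11.0, 0.0, 21.0, 10.0)    # disjoint, 1 unit away
far_miss = (100.0, 0.0, 110.0, 10.0)   # disjoint, 90 units away

loss_partial = iou_loss(partial, target)
loss_near = iou_loss(near_miss, target)
loss_far = iou_loss(far_miss, target)

# Both disjoint predictions get the same loss, so the loss carries no
# signal about how far a non-overlapping box is from the ground truth.
print(loss_partial, loss_near, loss_far)
```

Because the near and far misses receive identical loss values, an optimizer has nothing to follow once the boxes stop overlapping. Losses in the SIoU family add penalty terms beyond overlap for this reason.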
Keywords/Search Tags: Object detection, Multi-scale features, Feature matching, Contextual transformer, Localization loss