Research On Object Detection Algorithm Based On Feature Pyramid Fusion And Attention Mechanism

Posted on: 2024-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: T H Ding
Full Text: PDF
GTID: 2568307058476164
Subject: Signal and Information Processing
Abstract/Summary:
As an important task in computer vision, object detection aims to determine the location and category of objects quickly and accurately. However, current object detection algorithms do not achieve a good trade-off between detection accuracy, speed, and model complexity in complex scenes. This thesis therefore proposes two object detection algorithms that better balance accuracy, speed, and model complexity. Within a simplified model structure, feature pyramid fusion and attention mechanisms are introduced to improve the feature extraction and feature representation capabilities of the models. The main research work of this thesis is as follows:

(1) To address the low detection accuracy of anchor-free detection algorithms and the slow detection speed of anchor-based detection algorithms, this thesis proposes an anchor-free detection algorithm based on feature pyramid fusion and attention mechanism, called FABNet. The model consists mainly of a feature pyramid fusion module, a cascaded attention module, and a boundary feature extraction module. First, the feature pyramid fusion module enhances the feature extraction capability of the backbone network and generates fused features with rich semantic information, which helps with the detection of small objects and improves detection accuracy. Second, the cascaded attention module improves the feature representation ability of the model by using hierarchical attention, spatial attention, and channel attention to enhance the local representation of features. Finally, the boundary feature extraction module extracts the boundary features of the object, so that more foreground information is obtained against complex backgrounds. Experimental results show that FABNet achieves a trade-off between detection accuracy and detection speed on the BDD100K, PASCAL VOC, and KITTI public datasets, with mAP of 66.4%, 89.0%, and 77.64% and FPS of 28.9, 28.4, and 12.53, respectively.

(2) To address the high complexity and slow convergence of Transformer-based object detection algorithms, this thesis proposes an object detection algorithm based on an encoder-only Transformer, called DeoT. The model consists mainly of a feature pyramid fusion module and an encoder-only Transformer module. First, the feature pyramid fusion module generates fused features with strong semantic information to achieve local perception of features. Second, the encoder-only Transformer module uses deformable multi-head self-attention to achieve a global representation of features. In addition, the Transformer residual block refines the input and output features of the Transformer module by combining channel attention, spatial attention, and deformable multi-head self-attention, which alleviates the complexity and convergence problems of the model and improves detection accuracy. Experimental results show that DeoT achieves the fastest convergence and the best detection performance on the MS COCO and Cityscapes datasets, with 34 training epochs, 50.9% and 30.1% AP, and 30 and 46 FPS, respectively. This makes training efficient while keeping detection both real-time and accurate.
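The abstract names feature pyramid fusion, channel attention, and spatial attention but does not give their exact design. The following is a minimal, parameter-free sketch of the general ideas only, not the thesis's FABNet or DeoT modules. All function names (`upsample2x`, `fuse_top_down`, `channel_attention`, `spatial_attention`) and the toy data are hypothetical, and a real detector would presumably use learned convolutions and tensors from a deep learning framework rather than nested lists.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of every channel.

    fmap is a list of channels; each channel is a list of rows of floats.
    """
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def fuse_top_down(fine, coarse):
    """Add an upsampled coarse level to the next finer level, element-wise."""
    up = upsample2x(coarse)
    return [[[a + b for a, b in zip(row_f, row_u)]
             for row_f, row_u in zip(ch_f, ch_u)]
            for ch_f, ch_u in zip(fine, up)]


def channel_attention(fmap):
    """Scale each channel by a sigmoid gate of its global average (assumed form)."""
    out = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale each pixel by a sigmoid gate of its mean over channels (assumed form)."""
    n_ch = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    gates = [[sigmoid(sum(fmap[c][y][x] for c in range(n_ch)) / n_ch)
              for x in range(width)] for y in range(height)]
    return [[[fmap[c][y][x] * gates[y][x] for x in range(width)]
             for y in range(height)] for c in range(n_ch)]


# Toy pyramid: a 2-channel 4x4 fine level and a 2-channel 2x2 coarse level.
fine = [[[1.0] * 4 for _ in range(4)], [[0.0] * 4 for _ in range(4)]]
coarse = [[[2.0, 0.0], [0.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]]

fused = fuse_top_down(fine, coarse)
refined = spatial_attention(channel_attention(fused))
```

Here the coarse level carries the stronger semantic signal and the fine level the spatial detail, so adding them gives each fine-resolution position access to both. The two attention steps then rescale the fused map without changing its shape.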
Keywords/Search Tags: Object detection, Feature pyramid, Attention mechanism, Transformer