Research On Object Detection Algorithm Based On Deep Learning

Posted on: 2024-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: X H Chen
GTID: 2568307115478154
Subject: Mechanics
Abstract/Summary:
Object detection is one of the key technologies in computer vision. It is an extension of object classification and serves as the foundation for instance segmentation and semantic segmentation. It has extensive applications in daily life, in social production and operation, and in national defense. Recently, Transformers based on the attention mechanism have shown great potential in computer vision. However, their performance in object detection tasks still needs improvement. This thesis studies the problem and proposes a novel object detection algorithm. Building on it, the thesis further proposes a multi-task backbone network for object detection, instance segmentation, classification, and semantic segmentation. The main contributions and innovations are summarized as follows:

(1) This thesis explores the performance of the Transformer as the backbone for object detection, using YOLOX as the baseline algorithm. To address the poor feature extraction caused by the attention mechanism in Swin Transformer, it proposes a new reconstructed deformable self-attention based on important regions. Attention is shifted to these regions and more local dense attention is assigned to them, which allows global modeling of the objects and improves the capability and efficiency of long-range relationship modeling. A new backbone network is built on this reconstructed deformable self-attention, improving feature extraction capability and speed. To address the poor multi-scale feature representation of the neck in YOLOX, the thesis proposes a feature aggregation network based on Bi PAFPN, which assigns different weights to the input multi-scale features to highlight the contributions of important features. Experimental results on a public dataset show that the proposed object detection algorithm has lower complexity and achieves advanced levels of precision and real-time performance. In real-vehicle experiments, the model demonstrates high precision and stronger generalization ability.

(2) This thesis further studies the performance of Transformer technology in multi-task networks and proposes a multi-task backbone network. The tasks are object detection, object classification, instance segmentation, and semantic segmentation. To address non-edge interference in segmentation tasks, the thesis proposes a keypoint deformable self/cross-attention mechanism based on the reconstructed deformable self-attention. This mechanism transfers attention to the important area around the keypoints for global modeling, refining object edges and improving segmentation capability. To address the poor feature extraction performance of Transformers, the thesis proposes a feature extractor based on an encoder-decoder architecture that incorporates the keypoint deformable self/cross-attention mechanisms to enhance feature extraction. To address multi-stage feature loss and attention decay in multi-task networks, the thesis constructs a dense residual connection structure in which each stage receives the original input from all previous stages, achieving feature reuse and strengthening the attention of the network. Experiments on object detection, instance segmentation, object classification, and semantic segmentation datasets demonstrate that the proposed multi-task backbone network achieves state-of-the-art results on various evaluation metrics. Real-vehicle experiments show that the model achieves the expected performance in actual scenarios.
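The weighted feature aggregation mentioned in contribution (1) can be illustrated with a minimal sketch. The abstract does not give the exact weighting scheme, so the code below assumes the "fast normalized fusion" commonly used in BiFPN-style necks: each input feature gets a learnable scalar weight, the weights are clipped to be non-negative, and the output is their normalized weighted sum. The function name `fuse_weighted`, the `eps` parameter, and the use of flat lists in place of feature-map tensors are illustrative assumptions, not the thesis's implementation.

```python
from typing import List, Sequence


def fuse_weighted(
    features: Sequence[Sequence[float]],
    raw_weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-shaped features with normalized non-negative weights.

    Hypothetical helper: flat lists stand in for feature maps that have
    already been resized to a common resolution.
    """
    if len(features) != len(raw_weights):
        raise ValueError("need exactly one weight per input feature")
    if not features:
        raise ValueError("need at least one input feature")

    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all features must have the same length")

    # Clip weights at zero (ReLU) so no input contributes negatively.
    weights = [max(w, 0.0) for w in raw_weights]
    # eps keeps the division stable when all weights are zero.
    total = sum(weights) + eps

    # Element-wise normalized weighted sum across the input features.
    return [
        sum(w * f[i] for w, f in zip(weights, features)) / total
        for i in range(length)
    ]


if __name__ == "__main__":
    shallow = [1.0, 2.0]   # stand-in for a high-resolution feature
    deep = [3.0, 4.0]      # stand-in for an upsampled low-resolution feature
    print(fuse_weighted([shallow, deep], raw_weights=[1.0, 1.0]))
```

In a trained network the raw weights would be learned parameters, so features that matter more for detection end up with larger normalized weights.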
Keywords/Search Tags: deep learning, Transformer, object detection, YOLOX, multi-task network