
Research On Anchor-free Target Detection Technology Based On Efficient Feature Fusion And Feature Associatio

Posted on: 2024-07-19   Degree: Master   Type: Thesis
Country: China   Candidate: T Jiang   Full Text: PDF
GTID: 2568307106975749   Subject: Electronic information
Abstract/Summary:
Object detection, an important task in computer vision, aims to determine the location and category of objects in images or videos. Traditional object detection algorithms typically rely on handcrafted features or heuristic priors, so their performance is limited by scene complexity and by the expressive power of those features. In recent years, with the development of deep learning, object detection algorithms based on deep learning have achieved great success. This thesis explores the application of deep convolutional neural networks to object detection and further studies Anchor-Free object detection technologies based on deep learning. The research results are as follows:

First, this thesis proposes an Anchor-Free object detection method based on spatial perception and multi-attention weighted iterative fusion. Existing keypoint-based U-shaped Anchor-Free object detection networks cannot fully exploit low-level feature details, their decoding features have limited spatial perception and make limited use of multi-scale information, and they fuse features only by simple addition, a coarse approach that lowers fusion efficiency and hurts model performance. To overcome these limitations, the thesis designs a primary feature extraction module, a global residual perception module, and a multiple attention iterative fusion module. The primary feature extraction module obtains more detailed region representation information from the low-level features extracted by the backbone network. The global residual perception module then employs a multi-level feedback structure to extract multi-level contextual information from the high-level semantic features output by the last layer of the backbone network, significantly enhancing the multi-scale spatial perception capability of the network. For feature fusion, the proposed multiple attention iterative fusion module selectively processes coarse features, uses a cross-attention design to focus on the position information of the object, and collects contextual information from different receptive fields through a two-stage iterative fusion to correct feature inconsistency, guiding the efficient fusion of low-level features with high-level semantic features. In addition, an optimized correction module is used and a regression quality prediction branch is added at the output end of the Anchor-Free network; combining the result of this branch with the classification score through a new calculation method yields a better detection score. The method also incorporates the generalized intersection over union loss into the loss function to guide the network toward more accurate regression results more efficiently. Extensive experiments on standard object detection datasets validate the effectiveness of the proposed method.

Second, the first work, an Anchor-Free object detection method based on keypoint detection, achieves good performance with a framework that outputs a single heatmap. However, using a single feature map as the output may lead to insufficient focus on the pixel information of the input image, so that long-range dependency information cannot be fully captured. This thesis therefore proposes an Anchor-Free object detection method based on efficient feature correlation and long-range pixel focus. The whole network adopts a bidirectional fusion structure for feature aggregation, and detection performance is improved by enhancing feature correlation during detection and by using multi-level outputs. The network mainly consists of three components: a cross-stage representation enhancement module, a feature correlation module, and dynamic weighted convolution. The cross-stage representation enhancement module acquires global multi-scale information from the features produced by the feature extraction network and lets this information flow hierarchically through the network via skip connections, which better captures contextual information and enhances the features. The feature correlation module is added after each feature aggregation. By adopting a single-stage terminal aggregation scheme and constructing an efficient ESABottleneck, this module reduces feature redundancy and computational complexity, and it further strengthens the correlation within features to address the inadequate capture of long-range dependency information. The dynamic weighted convolution is applied at the output end and improves detection performance through matrix factorization and channel dynamic fusion operations. The proposed method is evaluated on public standard detection datasets and compared with current related methods on various metrics. The comparison results show that the proposed method has certain performance advantages and achieves good detection results.
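
To illustrate the generalized intersection over union (GIoU) loss mentioned in the abstract, the following is a minimal sketch of the standard GIoU definition, not the thesis implementation. It assumes axis-aligned boxes given as (x1, y1, x2, y2) with positive area; the function names giou and giou_loss are hypothetical and chosen only for this example.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes (x1, y1, x2, y2).

    Assumption: both boxes have positive width and height.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height are clamped to zero if disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, target_box):
    """GIoU loss: 0 for a perfect match, up to 2 for distant boxes."""
    return 1.0 - giou(pred_box, target_box)
```

Unlike plain IoU, this quantity stays informative when the predicted and target boxes do not overlap: the penalty term grows as the boxes move apart, so the loss still distinguishes a near miss from a distant one.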
Keywords/Search Tags:Object Detection, Anchor-Free Networks, Weighted Iterative Fusion, Long-Range Pixel Focus