| With the continuous development of deep learning in computer vision in recent years, object detection technology has become increasingly mature and is now widely used in medicine, transportation, education, and entertainment, bringing great convenience to daily life. Logo detection, an important branch of object detection, has significant research value for commodity image retrieval, infringement detection, and intelligent transportation. Commodity image retrieval finds a desired product image in a huge database from commodity information; infringement detection strengthens brand rights protection and service guarantees by detecting the logos in an image; intelligent transportation recognizes traffic indicators and accurately detects the traffic signs ahead on the road, thereby standardizing driving behavior and traffic order. Compared with ordinary object detection tasks, logo targets in real scenes vary widely in shape, include a large number of small targets, and appear against complex backgrounds. These interfering factors limit logo recognition performance. Most traditional detection techniques take a single logo target in the image as their starting point, which results in low detection accuracy. The wide application of deep learning in computer vision provides new ideas for logo detection research. Against this background, our research takes logo targets in real scenes as the research object, aims to improve detection and recognition accuracy, and builds a logo detection model based on deep learning. The main contributions of this thesis are as follows: (1) An analysis of logo data shows that the size of logo targets varies greatly in the detection task and that most of them are small targets. This thesis therefore proposes a logo detection model (SA-Net) that optimizes ResNet by integrating a self-attention mechanism. ResNet, as the backbone network of the detection model, captures only the limited features
obtained through convolution operations, and has low detection performance for small targets. To solve this problem, this thesis introduces a self-attention mechanism into the backbone network. The self-attention mechanism strengthens the focus on small-target regions, which helps to obtain richer feature information. In the feature extraction stage, a Recursive Feature Pyramid (RFP) recursively enhances the Feature Pyramid Network (FPN) to generate more powerful feature representations. Furthermore, Quality Focal Loss is introduced to better learn a joint representation of classification scores and IoU values, with the aim of resolving the inconsistency between the training and testing phases, and the DIoU loss function is introduced to speed up the convergence of training and obtain more accurate regression results. (2) The vast majority of logo detection models are built on a single object representation format, such as rectangular boxes in Faster R-CNN, center points in FCOS, and corner points in CornerNet. Because feature extraction for the different representations is heterogeneous and off-grid, it is often difficult to combine them in one model and fully exploit the strengths of each representation. Building on SA-Net, this thesis introduces an existing attention decoder module after the feature extraction stage and bridges the different representations into a detector built on a single representation format, yielding SAAD-Net, which improves the detection accuracy of the model. In this decoder, the features of the main representation serve as the query input, and the other representations act as a set of key instances that reinforce the query representation. In addition, center point pooling and corner point pooling are applied after feature extraction to select the center points and corner points suitable for classification and detection. At the same time, a shared location embedding technique is used to improve the detection robustness of the model in real-world scenarios and to reduce the computational complexity of the
model. In summary, the logo detection methods proposed in this thesis have been evaluated through extensive experiments on the LogoDet-3K dataset and on two public datasets (FlickrLogos-32 and QMUL-OpenLogo). The experimental results show that the proposed methods have clear advantages over current state-of-the-art object detection methods based on deep learning. |
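
The DIoU regression loss named in contribution (1) can be illustrated with a short sketch. It is not taken from the thesis implementation: it assumes axis-aligned boxes given as `(x1, y1, x2, y2)` tuples and uses the standard DIoU definition, 1 - IoU + ρ²/c², where ρ is the distance between the two box centers and c is the diagonal of the smallest box enclosing both. The function name `diou_loss` is hypothetical.

```python
def diou_loss(box_a, box_b):
    """Distance-IoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Illustrative sketch only; the name and box format are assumptions.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers (rho^2).
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest enclosing box (c^2).
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    c2 = enc_w ** 2 + enc_h ** 2

    penalty = rho2 / c2 if c2 > 0 else 0.0
    return 1.0 - iou + penalty


if __name__ == "__main__":
    # Identical boxes give zero loss; disjoint boxes still give a
    # distance-dependent value rather than a flat one.
    print(diou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
    print(diou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike a plain IoU loss, the center-distance term still changes with the predicted box position when the boxes do not overlap, which is the usual explanation for the faster convergence mentioned above.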