In recent years, the rapid development of deep learning has driven the growth of intelligent retail stores, unmanned convenience stores, and other “new retail” formats. In these scenarios, commodity recognition based on computer vision has become a key technology. As an application of object detection to “new retail”, commodity recognition faces several challenges: commodity types are highly diverse, and products differ significantly in size and appearance. In addition, image acquisition devices introduce problems such as fisheye distortion into the captured commodity images, which limits recognition accuracy. This dissertation studies commodity recognition algorithms with a focus on these problems. The main work and improvements are as follows.

First, the two-stage general object detection algorithm Cascade R-CNN is adopted as the baseline for commodity recognition. The dissertation analyzes its network structure and, to handle the deformation of products in the images, introduces deformable convolution into the algorithm. It discusses in detail how deformable convolution improves on ordinary convolution when handling deformed targets, and analyzes how applying deformable convolution at different stages of the Cascade R-CNN backbone affects recognition performance. Comparative experiments confirm that introducing deformable convolution into the backbone strengthens its feature extraction, particularly for deformed or distorted products: the network learns the degree of object deformation more effectively, which improves the overall detection performance of Cascade R-CNN. Experimental results show that introducing deformable convolution into specific stages of the Cascade R-CNN backbone improves detection performance by approximately 1%.

Second, on embedded and mobile devices with limited computing resources, two-stage object detection algorithms are poorly suited because of their heavy computational load. This dissertation therefore selects the one-stage detector YOLOv5 as the baseline and embeds the CBAM attention module into it. The CBAM module computes weights for the feature maps along the channel and spatial dimensions and uses them to reweight the input feature maps, strengthening the feature representation of target objects and suppressing irrelevant information. Integrating this attention mechanism improves the detection performance of YOLOv5. At the same time, the CBAM module has a simple structure and does not significantly change the YOLOv5 network, striking a good balance between recognition accuracy and model complexity. Experimental results show that the improved YOLOv5-CBAM algorithm achieves an increase of approximately 0.7% in detection performance.
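
The channel and spatial reweighting attributed to CBAM above can be illustrated with a minimal NumPy sketch. It assumes the commonly described CBAM layout: a shared two-layer MLP applied to average-pooled and max-pooled channel descriptors, followed by a 7×7 convolution over channel-wise average and max maps. The random weights, the reduction ratio `r`, the tensor sizes, and all function names are illustrative assumptions, not values or code from the dissertation, and the placement of the module inside YOLOv5 is not shown.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """Per-channel weights in (0, 1) for a feature map x of shape (C, H, W).

    w1 (C//r, C) and w2 (C, C//r) form a shared two-layer MLP (assumed layout).
    """
    avg_desc = x.mean(axis=(1, 2))  # global average pooling -> (C,)
    max_desc = x.max(axis=(1, 2))   # global max pooling     -> (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # linear -> ReLU -> linear

    return sigmoid(mlp(avg_desc) + mlp(max_desc))


def spatial_attention(x, kernel):
    """Per-location weights in (0, 1) of shape (H, W).

    kernel has shape (2, k, k) and is applied to the stacked channel-wise
    average and max maps with zero padding (cross-correlation, as in CNNs).
    """
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)


def cbam(x, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    x = x * channel_attention(x, w1, w2)[:, None, None]
    return x * spatial_attention(x, kernel)[None, :, :]


# Illustrative sizes and random weights (assumptions, not trained values).
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1

y = cbam(x, w1, w2, kernel)
print(y.shape)  # (8, 6, 6)
```

Because both sets of weights pass through a sigmoid, every output value is the input value scaled by a factor between 0 and 1, and the output keeps the input's shape. This is consistent with the observation that the module can be embedded without significantly changing the surrounding network structure.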