
Research And Implementation Of Image Recognition Based On Deep Learning

Posted on:2019-06-04
Degree:Master
Type:Thesis
Country:China
Candidate:A G Wang
GTID:2348330569987585
Subject:Software engineering
Abstract/Summary:
In this thesis, an identity residual block and a convolutional residual block are designed based on residual learning theory and used to construct a 50-layer residual network. This network reaches an accuracy of only 84.83% on the CIFAR-10 data set. The thesis then improves the network: the improved version is 245 layers deep and reaches an accuracy of 93.99% on CIFAR-10. The specific improvements are as follows.

Following the idea of residual learning, both the identity residual block and the convolutional residual block are improved. The redesigned identity residual block allows the network to be stacked deeper without network degradation, and the thesis gives a mathematical proof of the rationality of this improvement.

The loss function is optimized. With the optimized loss function, the partial derivative of the loss with respect to each weight stays within a reasonable range, which minimizes abnormal parameter updates.

The initialization method is modified. The 50-layer residual network was trained with the Xavier initialization method. A careful reading of the paper that introduced Xavier initialization shows that its authors derived the formula under the assumption that the activation function is an identity function. However, the ReLU activation function used in this thesis is not an identity function, so the Xavier initialization method is modified to account for ReLU. The new initialization method makes the weight updates more robust and the network converge faster, and the final trained model is more accurate.

Comparing the original 50-layer residual network with the improved 245-layer residual network shows, surprisingly, that the 245-layer network has fewer parameters than the 50-layer network. This is a result of the improved overall structure of the network. The thesis makes careful use of 1×1 and 3×3 convolution kernels. The size of the feature map is adjusted by setting the stride and padding of the convolution kernel, and the number of convolution kernels controls the number of feature map channels. These hyperparameters strongly affect the network's parameter count, which is discussed in detail in the text. The placement of the pooling layers is also chosen carefully. Objects in natural images have locally invariant features; that is, the features at two adjacent points in the feature map output by a convolution are usually very similar. To merge semantically similar features, a region of the feature map can be represented by its maximum value or by its average. This effectively reduces the feature map size without adding parameters.

The thesis uses MXBoard to display the loss and accuracy on the training set and validation set in real time during training, so that the state of the network can be observed. Finally, the thesis implements a traffic sign recognition system based on the improved residual network.
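The relationship between stride, padding, kernel count, and parameter amount described in the abstract can be illustrated with a short sketch. The function names and the 256/64-channel example below are illustrative assumptions and are not taken from the thesis. Only the standard output-size and parameter-count formulas for convolutions are used, together with the 32×32 image size of CIFAR-10.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window (floor mode)."""
    return (n + 2 * padding - kernel) // stride + 1


def conv_param_count(in_channels, out_channels, kernel, bias=True):
    """Learnable parameters of a 2D convolution with a square kernel."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if bias else 0)


# CIFAR-10 images are 32x32 pixels.
same = conv_output_size(32, kernel=3, stride=1, padding=1)   # size preserved
half = conv_output_size(32, kernel=3, stride=2, padding=1)   # size halved
pooled = conv_output_size(32, kernel=2, stride=2)            # 2x2 pooling, no parameters

# Hypothetical channel widths, chosen only to show the effect of 1x1 kernels:
# one 3x3 convolution on 256 channels versus a 1x1 -> 3x3 -> 1x1 stack
# that reduces to 64 channels and expands back to 256.
direct = conv_param_count(256, 256, 3, bias=False)
stacked = (
    conv_param_count(256, 64, 1, bias=False)
    + conv_param_count(64, 64, 3, bias=False)
    + conv_param_count(64, 256, 1, bias=False)
)

print(same, half, pooled)   # spatial sizes
print(direct, stacked)      # parameter counts
```

In this sketch the stacked variant needs far fewer weights than the single 3×3 convolution, which is one way a deeper network can end up with fewer parameters than a shallower one. Whether the thesis uses exactly this arrangement is not stated in the abstract.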
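The abstract does not give the formula of the modified initialization method. As a point of comparison only, the sketch below contrasts the standard Xavier (Glorot) normal scale with He initialization, a widely used ReLU-aware variant. Treating He initialization as a stand-in is an assumption, and it may differ from the modification derived in the thesis. The helper names and the 64-channel example are hypothetical.

```python
import math
import random


def xavier_std(fan_in, fan_out):
    """Standard deviation of Xavier (Glorot) normal initialization."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def he_std(fan_in):
    """Standard deviation of He normal initialization (ReLU-aware).

    Assumption: used here as a stand-in; the thesis's own formula is not
    given in the abstract.
    """
    return math.sqrt(2.0 / fan_in)


def init_weights(fan_in, fan_out, std, seed=0):
    """Draw a fan_out x fan_in weight matrix from N(0, std^2)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


# Example: a 3x3 convolution with 64 input and 64 output channels.
fan_in = fan_out = 3 * 3 * 64
sx = xavier_std(fan_in, fan_out)
sh = he_std(fan_in)
weights = init_weights(fan_in, fan_out, sh)

print(sx, sh, sh / sx)
```

When fan_in equals fan_out, the He scale is larger than the Xavier scale by a factor of sqrt(2). This compensates for ReLU zeroing roughly half of the pre-activations, which the identity-activation assumption mentioned in the abstract does not account for.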
Keywords/Search Tags:deep learning, image recognition, residual learning, residual network