Research On Weight Initialization Method Of Convolutional Neural Networks

Posted on:2023-03-02

Degree:Master

Type:Thesis

Country:China

Candidate:T T Xing

Full Text:PDF

GTID:2568306833465674

Subject:Computer technology

Abstract/Summary:

Weight initialization assigns suitable initial values to the weights of a convolutional neural network before training, with the aim of improving both the convergence speed during training and the final generalization ability of the model. Because training a convolutional neural network amounts to searching for optimal weights, a suitable initialization method is the basis for faster convergence and better generalization. Existing random initialization methods increase the burden on the training process, while transfer learning initialization places specific requirements on the target task and the model structure, which limits its range of application. This thesis starts from a statistical analysis of the weights of pre-trained models, derives the distribution law of those weights, and applies that law to model initialization, thereby reducing the blindness of random initialization and removing the dependence of transfer learning initialization on the model structure. The main work and innovations of this thesis are as follows:

(1) Because current random initialization methods are not guided by any distribution law, the regularities of the prior information contained in pre-trained models are investigated and a new initialization method is developed accordingly. Since existing random initialization methods are based on normal and uniform distributions, this thesis fits and tests the distribution characteristics of the weights. A normality test based on the Jarque-Bera statistic and a power-law test based on a linear fit in double-logarithmic coordinates suggest that the weights follow a symmetric power-law distribution. Analysis of the power exponent shows that the exponent of each convolutional layer decreases gradually as the layer depth increases. Based on this law, the probability distribution function used for initialization is defined as a symmetric power-law distribution function, which gives the mathematical model for weight initialization. Combined with the change law of the power exponent across convolutional layers, this yields a weight initialization method based on the convolutional layer law.

(2) Building on the above, the spatial distribution pattern of the convolution kernel weights is derived by studying the weight distribution at each position of the convolution kernel, and a second initialization method is developed from it. By studying the weights at the same width and height coordinates of the convolution kernel across different channels, it is found that the power exponent of the weights of the pre-trained model is related to the position within the convolution kernel, from the center to the periphery. A method for dividing the kernel positions into classes is proposed according to the law of decreasing step size. Based on the symmetric power-law distribution model, a data sampling algorithm that incorporates the change law of the power exponent within the convolution kernel is proposed, and a weight initialization method based on the convolution kernel position law is designed.

(3) Comparative experiments are set up to test the effectiveness and applicability of the two initialization methods. The commonly used He initialization method is chosen as the baseline, and several network models are compared on different datasets. The experiments verify that, for RGB image classification, the two power-law-based weight initialization methods improve the accuracy in the first round of training as well as the final accuracy of the model. It can also be concluded that the initialization method based on the convolution kernel position law is more effective than the one based on the convolutional layer law.
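
The abstract does not give the exact form of the symmetric power-law distribution, the exponent values, or the sampling algorithm, so the following Python sketch is only an illustration under stated assumptions. It assumes the weight magnitude follows a density proportional to x^(-alpha) for x >= x_min with a random sign, draws samples by inverse-transform sampling, and then computes the two diagnostics named in the abstract: the Jarque-Bera statistic and the slope of a linear fit in double-logarithmic coordinates. All function names, the parameters `alpha` and `x_min`, and the binning scheme are hypothetical and are not taken from the thesis.

```python
import math
import random


def sample_symmetric_power_law(n, alpha, x_min, rng):
    """Draw n values with |w| ~ p(x) proportional to x**(-alpha), x >= x_min.

    Assumed form only: the thesis abstract does not specify the density.
    A random sign makes the distribution symmetric about zero.
    """
    if alpha <= 1.0 or x_min <= 0.0:
        raise ValueError("need alpha > 1 and x_min > 0")
    samples = []
    for _ in range(n):
        u = rng.random()  # uniform in [0, 1)
        magnitude = x_min * (1.0 - u) ** (-1.0 / (alpha - 1.0))  # inverse CDF
        sign = 1.0 if rng.random() < 0.5 else -1.0
        samples.append(sign * magnitude)
    return samples


def jarque_bera(values):
    """Jarque-Bera statistic n/6 * (S**2 + (K - 3)**2 / 4)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def loglog_slope(values, x_min, n_bins=20):
    """Least-squares slope of log(density) against log(|w|).

    Uses geometrically spaced bins (a hypothetical choice) and skips empty
    bins. Assumes the largest magnitude is strictly greater than x_min.
    """
    mags = [abs(v) for v in values if abs(v) >= x_min]
    ratio = (max(mags) / x_min) ** (1.0 / n_bins)  # bin growth factor
    counts = [0] * n_bins
    for m in mags:
        k = min(int(math.log(m / x_min) / math.log(ratio)), n_bins - 1)
        counts[k] += 1
    xs, ys = [], []
    for k, c in enumerate(counts):
        if c == 0:
            continue
        left = x_min * ratio ** k
        width = left * (ratio - 1.0)
        xs.append(math.log(left * math.sqrt(ratio)))  # geometric bin center
        ys.append(math.log(c / (len(mags) * width)))  # empirical density
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


rng = random.Random(0)
weights = sample_symmetric_power_law(5000, alpha=3.5, x_min=0.01, rng=rng)
jb = jarque_bera(weights)
slope = loglog_slope(weights, x_min=0.01)
print("Jarque-Bera statistic:", jb)
print("log-log slope (roughly -alpha for a power law):", slope)
```

Under the null hypothesis of normality, the Jarque-Bera statistic is asymptotically chi-squared with two degrees of freedom, so values far above roughly 5.99 reject normality at the 5% level. A layer-wise or position-wise initializer in the spirit of the abstract would call the sampler with a different exponent per layer or per kernel position; those exponent schedules are not given in the abstract and are left out here.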

Keywords/Search Tags:

Convolutional Neural Networks, Weight Initialization, Transfer Learning, Pre-training Models

Related items

1	Research And Application Of Weight Initialization Of Convolutional Neural Network
2	An Improved Deep Convolutional Neural Network And Its Weight Initialization
3	Research And Application Of The Pretraining Strategies Of Deep Convolutional Neural Network
4	Research On Initialization Method Of Convolutional Neural Networks
5	Radar Target Recognition Based On Transfer Learning
6	Research And Implementation Of Handwritten Objective Question Letter Recognition Based On Convolutional Neural Network And Transfer Learning
7	The Research On The Method Of Image Recognition Based On Convolutional Neural Networks
8	Research On Application Of Deep Convolutional Neural Network Models For Feature Extraction And Classification
9	The Application Of A Multi-layers Pre-training Convolutional Neural Network In Image Recognition
10	VLSI Optimizations And Implementations For Convolutional Neural Networks