In recent years, deep learning has achieved strong performance on problems such as computer vision, speech recognition, and natural language processing. The Artificial Neural Network (ANN) is the fundamental mathematical model underlying deep learning. Most optimization algorithms for ANN training are based on gradient descent, but when training data are insufficient or the network is overtrained, overfitting often results. A common remedy is regularization: L1 regularization adds the sum of the absolute values of the weights to the loss function, which introduces a nondifferentiable term. In this study, we investigate how to train an ANN with an L1-regularized loss function. The specific work is as follows:

1. We analyze forward and backward propagation in an ANN with an L1-regularized loss function and derive the partial derivatives with respect to each parameter.

2. We apply the iterative shrinkage-thresholding algorithm (ISTA), which is based on the proximal operator, to ANN training. Combined with the mini-batch technique, we propose a stochastic iterative shrinkage-thresholding algorithm (SISTA) suited to ANN training, give the procedure for training an ANN with SISTA, and prove that the iterate sequence converges linearly.

3. We carry out numerical experiments on minimizing unconstrained problems with ISTA and on training ANNs, together with a comparative analysis against SGD, and study the choice of the learning-rate and regularization-coefficient hyperparameters.

The experiments verify the feasibility of the stochastic iterative shrinkage-thresholding algorithm for ANN training. Overall, compared with SGD, the proposed SISTA improves ANN classification accuracy to a certain extent while better suppressing overfitting.
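To make the proximal-operator step concrete, the following is a minimal sketch (not the thesis's implementation) of the elementwise soft-thresholding operator and one SISTA-style update on a mini-batch. For illustration it uses a least-squares smooth loss on a single linear layer; the function names `soft_threshold` and `sista_step` and all parameters shown are assumptions for this sketch, not names from the thesis.

```python
import numpy as np

def soft_threshold(w, tau):
    # Proximal operator of tau * ||w||_1: shrinks each entry toward
    # zero by tau and clips entries with magnitude below tau to zero.
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

def sista_step(W, X_batch, y_batch, lr, lam):
    # One stochastic ISTA update for the objective
    #   (1/2n) * ||X W - y||^2 + lam * ||W||_1
    # computed on a mini-batch: a gradient step on the smooth part,
    # followed by the proximal (soft-thresholding) step on the L1 part.
    n = len(X_batch)
    grad = X_batch.T @ (X_batch @ W - y_batch) / n  # gradient of smooth loss
    return soft_threshold(W - lr * grad, lr * lam)

# Usage sketch: a few updates on random data tend to drive small
# weights exactly to zero, which is the sparsifying effect of L1.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))
y = rng.normal(size=(32, 1))
W = rng.normal(size=(5, 1))
for _ in range(10):
    W = sista_step(W, X, y, lr=0.1, lam=0.5)
```

The key design point is that the nondifferentiable L1 term is never differentiated: it is handled exactly by the closed-form proximal operator, while only the smooth data-fitting term contributes a gradient.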