
Research On Variance Reduction Gradient Algorithm For Large-scale Data In Machine Learning

Posted on: 2021-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhu
Full Text: PDF
GTID: 2428330614961442
Subject: Computer Science and Technology
Abstract/Summary:
The rapid development of technology allows people to obtain large amounts of data, which contain important information as well as various kinds of noise. Extracting useful knowledge from such data is a central task of machine learning at this stage. Mathematical optimization is one of the foundations of machine learning and touches almost every part of the subject. In machine learning, optimization mainly means solving the empirical risk minimization problem, that is, minimizing the average loss of a model's predictions over the training samples (the standard formulation is written out after this abstract). Because solving the empirical risk minimization problem is itself a major challenge for the whole field, designing fast and efficient optimization algorithms has long been a goal of researchers. Targeting mathematical optimization in machine learning, this thesis first proposes an optimization algorithm for the empirical risk minimization problem, and then studies the structural risk minimization problem. The main research results of this thesis are as follows:

(1) Batch Subtraction Update Gradient (BSUG). Because the variance introduced by stochastic gradients prevents plain stochastic gradient descent from converging linearly, we improve on existing variance reduction algorithms and design a new one. During training, BSUG uses a small batch of samples instead of all samples to compute the average gradient, and at the same time performs a subtraction update on that average gradient (an illustrative sketch of this style of update follows this abstract). Experimental comparisons with other variance reduction algorithms show that BSUG attains a linear convergence rate. To examine the stability of the algorithm, we also compare how the tuning parameters affect training. Experiments show that BSUG has a certain degree of stability and reaches adequate training accuracy even with small batch sizes.

(2) Proximal Batch Subtraction Update Gradient (Prox BSUG). To solve the structural risk minimization problem, Prox BSUG adds a regularizer to the objective function on top of the BSUG algorithm and, to accelerate convergence, uses a proximal operator in the weight update (see the proximal-step sketch below). Experimental comparisons show that Prox BSUG trains well on different data sets. Finally, by constructing structural risk minimization problems and comparing against other optimization algorithms, experiments show that Prox BSUG achieves good results in solving the structural risk minimization problem.
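For reference, a standard way to write the empirical risk minimization objective the thesis targets is shown below; the notation is assumed here, since the abstract itself gives no formulas.

```latex
% Empirical risk minimization over n training samples,
% where f_i(w) is the loss of the model with parameters w on sample i:
\min_{w \in \mathbb{R}^d} \; F(w) = \frac{1}{n} \sum_{i=1}^{n} f_i(w)

% The structural risk minimization variant adds a regularizer R
% with weight \lambda:
\min_{w \in \mathbb{R}^d} \; F(w) + \lambda R(w)
```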
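The abstract does not give BSUG's exact update rule, so the following is only a minimal sketch of the two ingredients it names, in the style of SVRG-type variance reduction: the anchor average gradient is estimated from a small batch rather than from all samples, and each stochastic step subtracts an anchor correction term. All names and parameters here (grad_i, batch_size, the inner-loop length m) are illustrative assumptions, not the thesis' implementation.

```python
import numpy as np

def bsug_sketch(grad_i, w0, n, n_epochs=30, m=None,
                batch_size=64, lr=0.1, seed=None):
    """Illustrative SVRG-style variance-reduction loop.

    NOT the thesis' exact BSUG update (the abstract gives no formula);
    it only shows the two ideas the abstract names:
      (a) the anchor average gradient is estimated on a small batch
          rather than on all n samples, and
      (b) each stochastic step subtracts the anchor correction.
    grad_i(w, i) must return the gradient of the loss on sample i.
    """
    rng = np.random.default_rng(seed)
    m = m or n                      # inner-loop length (one pass by default)
    w_snap = np.asarray(w0, dtype=float).copy()
    for _ in range(n_epochs):
        # (a) mini-batch estimate of the average gradient at the snapshot
        batch = rng.choice(n, size=batch_size, replace=False)
        mu = np.mean([grad_i(w_snap, i) for i in batch], axis=0)
        w = w_snap.copy()
        for _ in range(m):
            i = rng.integers(n)
            # (b) variance-reduced step: subtract the snapshot gradient
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w = w - lr * g
        w_snap = w
    return w_snap
```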
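For the proximal step in Prox BSUG, the abstract names a regularizer but not which one. Assuming an L1 penalty lam * ||w||_1 purely for illustration, the proximal operator reduces to soft-thresholding, and it would replace the plain gradient step in the sketch above:

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of t * ||.||_1: shrink each coordinate toward 0."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

# In the inner loop of the BSUG sketch, replace
#     w = w - lr * g
# with a proximal (forward-backward) step on the regularized objective:
#     w = soft_threshold(w - lr * g, lr * lam)
```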
Keywords/Search Tags:Machine learning, Algorithm optimization, Subtraction update, Stochastic variance reduction gradient, Empirical risk minimization