
Research on Efficient Derivative-Free Automatic Machine Learning

Posted on: 2022-10-06    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Y Q Hu    Full Text: PDF
GTID: 1488306725471294    Subject: Computer Science and Technology
Abstract/Summary:
Machine learning applications require task-specific configuration of learning pipelines and hyper-parameters. Traditionally, this configuration has been done manually, which consumes human labor and relies on expert knowledge; it is costly and hinders machine learning from expanding its application scope. Automatic Machine Learning (AutoML) aims to configure machine learning pipelines automatically, effectively lowering both the cost and the entry threshold of applications. AutoML, which covers algorithm selection and hyper-parameter optimization, typically faces a complex optimization problem that is non-differentiable, high-dimensional, and non-convex. Derivative-free optimization does not require a differentiable objective function and is therefore well suited to this problem. However, applying derivative-free optimization to AutoML often suffers from sampling redundancy, a large search space, a low convergence rate, and high evaluation cost. This dissertation studies these problems, proposes new methods, and develops a neural architecture search system based on them. The main results are as follows:

1. For the sampling-redundancy problem of derivative-free optimization, a sequential derivative-free optimization method is proposed. Most derivative-free optimization methods rely on redundant batch-based random sampling, yet evaluations in AutoML tasks are expensive, and evaluating redundant samples is the main source of the high time cost of optimization. Can the number of samples per iteration be reduced to improve optimization efficiency? This dissertation proposes the sequential derivative-free optimization method SRACOS, which draws and evaluates only one sample per iteration. Theoretical analysis shows that SRACOS has a smaller sample complexity than the batch-based method. SRACOS outperforms the batch-based method on 9 reinforcement learning direct policy search tasks, achieving an average improvement of 23%.

2. For the huge hyper-parameter search space, a cascaded algorithm selection method is proposed. AutoML usually optimizes the hyper-parameters of all learning algorithms jointly, which forms a huge search space. Can algorithm selection and hyper-parameter optimization be treated separately so that each optimization process faces a smaller search space? This dissertation proposes a cascaded algorithm selection method with a two-level process: the lower level optimizes the hyper-parameters of each learning algorithm, while the upper level is a Multi-Armed Bandit (MAB) problem that decides how to allocate the limited time budget among the lower-level optimization processes. To find the best configuration at the upper level, this dissertation proposes the Extreme-Region Upper Confidence Bound (ER-UCB) bandit algorithm and proves that it has logarithmic and sub-linear cumulative regret upper bounds on stationary and non-stationary feedback distributions, respectively. Combining ER-UCB with SRACOS into the cascaded method, ER-UCB successfully finds the best learning algorithm on a total of 6 datasets and improves performance by 1.2% on average; a simplified sketch of the cascade follows below.
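To make the cascaded structure concrete, here is a minimal, self-contained Python sketch under simplifying assumptions: plain uniform sampling stands in for SRACOS's model-guided sampling, and the classic UCB1 index stands in for the ER-UCB extreme-region index. The names SequentialOptimizer, cascaded_search, and the toy objectives are illustrative, not the dissertation's actual API.

```python
import math
import random

class SequentialOptimizer:
    """Lower level: draw and evaluate one sample per iteration."""

    def __init__(self, objective, bounds):
        self.objective = objective  # expensive evaluation, e.g. train + validate
        self.bounds = bounds        # [(low, high), ...] per hyper-parameter
        self.best_x = None
        self.best_y = float("-inf")

    def step(self):
        # One sample per iteration is the sequential idea; SRACOS samples
        # from a learned promising region rather than uniformly at random.
        x = [random.uniform(lo, hi) for lo, hi in self.bounds]
        y = self.objective(x)
        if y > self.best_y:
            self.best_x, self.best_y = x, y
        return y

def cascaded_search(optimizers, budget):
    """Upper level: a UCB-style bandit allocates the evaluation budget
    among per-algorithm optimizers (simplified stand-in for ER-UCB)."""
    n = len(optimizers)
    counts = [0] * n
    means = [0.0] * n
    for t in range(1, budget + 1):
        if t <= n:  # play each arm once before using the index
            arm = t - 1
        else:
            arm = max(range(n), key=lambda i: means[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = optimizers[arm].step()
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return max(optimizers, key=lambda o: o.best_y)

# Toy usage: two "learning algorithms" with different hyper-parameter spaces.
opts = [SequentialOptimizer(lambda x: -sum(v * v for v in x), [(-1, 1)] * 3),
        SequentialOptimizer(lambda x: -abs(x[0] - 0.5), [(0, 1)])]
best = cascaded_search(opts, budget=100)
print(best.best_y)
```

The key structural point is that each lower-level step consumes exactly one evaluation, so the upper-level bandit can reallocate the limited budget at evaluation granularity rather than in batches.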
3. For the low convergence rate of derivative-free optimization, an experience-based derivative-free optimization method is proposed. Derivative-free optimization explores the search space using evaluation values alone, so it converges slowly on complex problems such as AutoML. Can information from historical optimization runs be used to accelerate convergence? This dissertation proposes the experience-based derivative-free optimization method EXPSRACOS, which introduces optimization directions extracted from past runs to reduce undirected search on new tasks and thereby improves the convergence rate. Further, to handle the mismatch between historical and current tasks, this dissertation proposes the ADASRACOS algorithm, which selects experience adaptively. On 39 of 40 datasets, the proposed method outperforms the compared methods with only 30 samples.

4. For the high evaluation cost of hyper-parameters, a multi-fidelity derivative-free optimization method is proposed. Evaluating a hyper-parameter setting involves training and validation, which is costly and hinders fast iteration of derivative-free optimization. Can fast but less accurate low-fidelity evaluations be introduced to reduce the total evaluation time? This dissertation proposes the multi-fidelity derivative-free optimization method TSESRACOS, which corrects fixed low-fidelity evaluations with a residual predictor so that they can replace part of the high-fidelity evaluations, and designs a transfer series expansion scheme to train the residual predictor efficiently (see the sketch after item 5). On hyper-parameter optimization tasks over 12 datasets, TSESRACOS achieves performance similar to that of high-fidelity optimization with only about 32% of the average time cost.

5. A competition-based derivative-free neural architecture search system is designed. Neural architecture search for deep learning is a special AutoML task with a large search space and an extremely high evaluation cost. In Huawei's business scenarios, the searched architectures must not only perform well but also satisfy constraints on parameter scale and calculation operators. Based on the derivative-free AutoML methods proposed above, this dissertation designs the competition-based derivative-free neural architecture search system CNAS. CNAS treats network topology search and calculation operator optimization in cascade and combines them through a competition mechanism; it also uses search experience to warm-start both processes, further improving search efficiency. Compared with manually designed network architectures and other search methods (ENAS and DARTS), CNAS achieves an average accuracy improvement of 4.1% on image classification tasks and a 1.07 dB PSNR improvement on image denoising tasks, while the obtained architectures have fewer parameters, meeting the requirements of Huawei's business scenarios.
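Returning to item 4, the residual-correction idea can be sketched as follows. This is a minimal illustration under assumed names (ResidualCorrectedEvaluator, low_fidelity_eval, high_fidelity_eval); a ridge regressor stands in for the dissertation's residual predictor, and the transfer series expansion used to train it across tasks is omitted.

```python
import numpy as np
from sklearn.linear_model import Ridge

class ResidualCorrectedEvaluator:
    """Low-fidelity score plus a learned residual approximates the
    high-fidelity score, so most candidates can skip full training."""

    def __init__(self, low_fidelity_eval, high_fidelity_eval):
        self.low = low_fidelity_eval    # cheap, e.g. few epochs or a data subsample
        self.high = high_fidelity_eval  # expensive full training + validation
        self.model = Ridge(alpha=1.0)   # residual predictor (illustrative choice)
        self.X, self.r = [], []
        self.fitted = False

    def calibrate(self, config):
        """Occasionally pay the high-fidelity cost to collect a residual."""
        lo, hi = self.low(config), self.high(config)
        self.X.append(list(config) + [lo])
        self.r.append(hi - lo)
        self.model.fit(np.array(self.X), np.array(self.r))
        self.fitted = True
        return hi

    def evaluate(self, config):
        """Cheap surrogate used inside the derivative-free loop."""
        lo = self.low(config)
        if not self.fitted:
            return lo
        residual = self.model.predict(np.array([list(config) + [lo]]))[0]
        return lo + residual
```

The design intent mirrors the text above: the derivative-free optimizer calls evaluate() for most candidates and only occasionally calls calibrate(), so the total time cost is dominated by the fast low-fidelity evaluations.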
Keywords/Search Tags: machine learning, automatic machine learning, derivative-free optimization, algorithm selection, hyper-parameter optimization, neural architecture search