Research And Application Of Online Learning And Distributed SCAD Algorithm

Posted on: 2021-11-14
Degree: Master
Type: Thesis
Country: China
Candidate: B Li
Full Text: PDF
GTID: 2481306095469394
Subject: Statistics
Abstract/Summary:
With the rapid development of information technology, it has become possible to collect high-dimensional data in many scientific fields, and machine learning theory is increasingly mature. Training machine learning algorithms on observational data is now common across disciplines. In the era of big data, however, the amount of data tends to grow geometrically. Once such a volume of data has been stored and organized, a natural question is how to design learning algorithms suited to it. The rise of online learning algorithms and distributed algorithms in recent years has provided new processing and analysis methods for this problem. Online learning addresses a shortcoming of offline training, which has difficulty analyzing real-time data streams. Distributed algorithms store and process massive high-dimensional data more easily than centralized computing, while greatly reducing computing time and improving analysis efficiency. We focus on big data processing methods, including classification based on online learning for logistic regression and energy consumption prediction for machine tool operation based on distributed SCAD.

The chapters of this thesis are arranged as follows. The first chapter briefly describes the significance and research background of online learning algorithms and distributed algorithms in big data processing. In the second chapter, we study online logistic regression with regularization terms. We propose an online logistic-l2 regression model and give an estimate of the regret bound based on online learning theory. Experiments with simulated and real data show that the proposed model and algorithm can match the classification results of offline prediction and effectively solve the classification problem for continuous data streams. In the third chapter, variable selection is performed using the ADMM algorithm and the non-convex SCAD penalty. Taking the energy consumption of machine tool operation as a case study, a corresponding distributed algorithm is designed to build the energy consumption model. Experiments with real data from CNC machine tools show that the proposed model and algorithm can effectively predict the energy consumption of CNC machine tool operation, and provide a new and effective method for energy analysis in industrial production. Finally, the fourth chapter summarizes the main work of the thesis and gives several directions for future research.
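The abstract names two building blocks without giving formulas, so the following is only a minimal sketch under stated assumptions, not the thesis's own algorithm. It shows plain online gradient descent on an l2-regularized logistic loss, and the standard SCAD penalty in its usual piecewise form. The function names, the step size eta0 / sqrt(t), the regularization weight lam, and the conventional default a = 3.7 are illustrative choices not taken from the thesis. The ADMM-based distributed solver is not reproduced here.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def online_logistic_l2(stream, dim, lam=0.01, eta0=1.0):
    """Online gradient descent on the l2-regularized logistic loss.

    stream yields (x, y) pairs one at a time, with x a list of floats
    and y in {0, 1}. Each sample is used for prediction first and for
    the weight update afterwards, as in the online learning setting.
    lam and eta0 are assumed hyperparameters, not values from the thesis.
    """
    w = [0.0] * dim
    mistakes = 0
    for t, (x, y) in enumerate(stream, start=1):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        mistakes += int((p >= 0.5) != (y == 1))
        eta = eta0 / math.sqrt(t)  # decaying step size (assumption)
        # Gradient of log-loss plus (lam / 2) * ||w||^2.
        w = [wi - eta * ((p - y) * xi + lam * wi) for wi, xi in zip(w, x)]
    return w, mistakes


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty for a single coefficient (requires a > 2).

    Linear near zero, quadratic in the middle, constant for large
    |beta|, so large coefficients are not shrunk further.
    """
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2
```

For example, online_logistic_l2([([1.0], 1), ([-1.0], 0)], dim=1) processes two samples in order and returns the learned weights together with the number of prediction mistakes made along the way.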
Keywords/Search Tags: Machine learning, Online learning, Distributed algorithms, Logistic regression, SCAD