
Construction And Application Of LASSO Logistic Model For Fat (High) Big Data

Posted on: 2023-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xing
GTID: 2568306614487444
Subject: Applied statistics
Abstract/Summary:
With the significant improvement of data collection and storage capacity, large-volume and high-dimensional data now appear frequently in many fields of scientific research. The emergence of big data, and the widespread use of machine learning built on it, has had a profound impact on statistics. At present, the random forest algorithm, which combines multiple machine learning models, is usually applied to "tall big data" problems, whose main feature is that the sample size exceeds the number of variables. This approach is both complicated and difficult to interpret. Starting from the poor prediction performance caused by poor data quality, a problem widely faced in the field of big data, this thesis aims to construct a sparse model that is simple to apply, accurate in prediction and easy to interpret for two types of big data, "fat big data" and "tall big data", and to extend it to credit research and other fields. By comparing the advantages and disadvantages of the LASSO Logistic model, the Group LASSO Logistic model and the HierNet LASSO Logistic model, and targeting big data problems in which interactions between variables have an important influence on the outcome, the thesis constructs a sparse model with strong explanatory power and high prediction accuracy. It introduces this model into the study of personal credit default for the first time, which effectively addresses the complicated steps and low prediction accuracy of traditional modeling methods. In particular, where prediction is poor because of small sample size, high dimensionality and limited information in the data, the model can mine the correlations contained in the data in depth and significantly improve prediction. Moreover, the model is easy to use and can be extended to problems with similar data characteristics, which gives it strong practical significance.
Keywords/Search Tags: LASSO Logistic model, Group LASSO Logistic model, HierNet LASSO Logistic model, Sparsity, Fat big data, Tall big data
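
As a rough illustration of the plain LASSO Logistic model named in the abstract, the sketch below fits an L1-penalized logistic regression by proximal gradient descent on synthetic "fat" data (more variables than samples). It is not the thesis's implementation: the function names, the synthetic data generator, the penalty weight `lam`, the step size and the iteration count are all assumptions chosen for illustration, and the Group LASSO and HierNet variants (which handle grouped variables and interactions) are not covered.

```python
import math
import random


def soft_threshold(z, t):
    """Proximal operator of t * |z|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_lasso_logistic(X, y, lam, step=0.1, n_iter=300):
    """Minimize mean logistic loss + lam * ||beta||_1 by proximal gradient.

    The intercept b0 is left unpenalized. Returns (b0, beta).
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label.
        r = [
            sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, row))) - yi
            for row, yi in zip(X, y)
        ]
        g0 = sum(r) / n
        grad = [sum(r[i] * X[i][j] for i in range(n)) / n for j in range(p)]
        # Gradient step on the smooth loss, then soft-thresholding for the L1 term.
        b0 -= step * g0
        beta = [soft_threshold(bj - step * gj, step * lam)
                for bj, gj in zip(beta, grad)]
    return b0, beta


def make_fat_data(n=30, p=100, seed=0):
    """Hypothetical synthetic data: p > n, only the first 3 variables matter."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = []
    for row in X:
        z = 2.0 * row[0] - 2.0 * row[1] + 1.5 * row[2]
        y.append(1 if rng.random() < sigmoid(z) else 0)
    return X, y


X, y = make_fat_data()
b0, beta = fit_lasso_logistic(X, y, lam=0.1)
selected = [j for j, bj in enumerate(beta) if bj != 0.0]
print(len(selected), "of", len(beta), "coefficients are nonzero")
```

Because the soft-thresholding step sets small coefficients exactly to zero, the fitted model keeps only a subset of the variables, which is the source of the sparsity and interpretability the abstract refers to. Larger values of `lam` give sparser models; in practice the penalty weight would be chosen by cross-validation.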