
Research on Employee Turnover Prediction Based on the CatBoost Algorithm

Posted on: 2021-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: Q Ding
GTID: 2438330626954360
Subject: Applied statistics
Abstract/Summary:
Talent is the core competitiveness of an enterprise. It is also an important soft resource for the development of every industry and an important driving force for economic development. With the rapid progress of science and technology, the benefits that talent brings to enterprises are incalculable. Brain drain therefore harms the enterprise and poses a serious threat to the company’s operating costs and even its organizational structure, so it is very important to manage human resources reasonably and reduce unnecessary brain drain. According to a report of the American Management Association, the cost of recruiting a replacement after an employee leaves is at least 30% of the annual salary of the post. For jobs with a skills shortage, the recruitment cost is equal to 1.5 times the annual salary, which does not include the losses to the enterprise caused by the vacancy (loss of customers and key technologies, impact on operating efficiency) or the cost of training new employees.

Big data has now become popular all over the world and, combined with Internet technology, is used in every industry. The emergence of big data makes it possible to obtain knowledge through data analysis, and to a large extent it has shifted people's thinking from the pursuit of causality to the exploration of correlation. With the help of data thinking and a suitable machine learning algorithm, this paper analyzes and forecasts employee turnover.

In this paper, the CatBoost machine learning algorithm is applied to build a prediction model for employee turnover. The data come from the open-source IBM HR data set. Before the prediction model was built, the data were preprocessed: dirty data were cleaned, the data were standardized, descriptive statistics were computed, and the SMOTE algorithm was used to oversample the imbalanced data to ensure the validity of the data. In addition, the SCAD algorithm is used to filter the variables before modeling. Evaluation with the ROC curve (AUC) and the confusion matrix shows that the model with filtered variables predicts better than the model with unfiltered variables. Finally, based on the analysis results, some suggestions are put forward for the company.
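Two of the steps named in the abstract, SMOTE oversampling and evaluation by confusion matrix and AUC, can be illustrated with a minimal sketch. It is not the thesis's implementation, which the abstract does not give. It uses only the Python standard library, the function names (`smote_oversample`, `confusion_matrix`, `roc_auc`) and the toy data are hypothetical, and the CatBoost model and the SCAD filtering step are not shown.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points (the basic SMOTE idea):
    pick a minority sample, pick one of its k nearest minority
    neighbours, and interpolate at a random position between them."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive scores
    higher than a randomly chosen negative; ties count one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not the IBM HR data).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_oversample(minority, n_new=6)

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]                # hypothetical model scores
y_pred = [int(s >= 0.5) for s in scores]      # threshold at 0.5

print(confusion_matrix(y_true, y_pred))  # (2, 0, 1, 1)
print(roc_auc(y_true, scores))           # 0.75
```

Because each synthetic point is interpolated between two real minority samples, it always lies on the segment joining them, which is why SMOTE adds plausible minority cases rather than exact duplicates. The confusion matrix depends on the chosen threshold, whereas the AUC depends only on how the scores rank positives against negatives.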
Keywords/Search Tags: employee turnover, CatBoost, SMOTE algorithm, variable filtering, SCAD algorithm, ROC curve