| Modern enterprise management has gradually become people-centered. In the era of the knowledge economy, talent has become the most important resource in enterprise development, and retaining talent is crucial for enterprises that want to hold a place in their business fields. However, many enterprises currently face abnormally high employee turnover, which seriously hinders their growth and development. To avoid brain drain and prevent the adverse impact of frequent talent flow on enterprises, we can use big data research methods from machine learning to analyze the current state of brain drain from enterprise employee turnover information, explore the key factors that significantly affect employee turnover behavior, and build an employee turnover prediction model that the enterprise human resources department can consult in order to intervene in time and prevent brain drain. Employee turnover refers to the process of employees leaving the company voluntarily, of their own will. In this thesis, we use the employee turnover dataset on the Kaggle platform to predict employee turnover. The feature variables mainly include employees' basic identity information, job information, salary and welfare, quality of life, etc. The dataset has no missing values and the data quality is relatively high, so some data preprocessing work can be omitted. First, we conduct an exploratory analysis of the dataset, mainly by drawing spinograms, boxplots and column charts for visual display, and make an initial exploration of the correlation between the feature variables and the target variable. We then preprocess the data, including imbalance handling, standardization and feature encoding, to facilitate the subsequent application of machine learning methods and to improve the prediction accuracy and efficiency of the model. In the empirical analysis, to address the high dimensionality of the feature variables, we mainly use Lasso for dimensionality reduction, and then construct support vector machine, decision tree, random forest and XGBoost models using the selected variables. Grid search and cross-validation are used to find the optimal parameters of some models to further improve their prediction performance. We find that the support vector machine model performs better, with a prediction accuracy of up to 92%. The specific impact of the feature variables is then analyzed using the feature importance ranking output by the machine learning model. We find that overtime is the most important variable in employee turnover prediction, followed by variables reflecting job benefits, such as stock option level and monthly income. In response to these findings, this thesis offers suggestions to help this company avoid employee turnover, and also aims to provide new ideas for other companies seeking to reduce their turnover rates. Finally, we summarize the research work: we describe the research content of this thesis, explain the prediction performance of the machine learning models, point out the shortcomings of this thesis’s research, and provide an outlook for further research. |
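
The three preprocessing steps named in the abstract (feature encoding, standardization and imbalance handling) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the toy records, the column names `overtime` and `monthly_income`, the helper functions, and in particular the choice of random oversampling, since the abstract does not say which imbalance technique the thesis uses.

```python
import random
from statistics import mean, pstdev


def one_hot(values):
    """Encode a categorical column as 0/1 indicator columns (sorted category order)."""
    categories = sorted(set(values))
    encoded = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return encoded, categories


def standardize(values):
    """Rescale a numeric column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until every class has equal count.

    Assumption: random oversampling stands in for whatever imbalance
    technique the thesis actually applies.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    target = max(len(class_rows) for class_rows in by_class.values())
    out_rows, out_labels = [], []
    for y, class_rows in by_class.items():
        extra = [rng.choice(class_rows) for _ in range(target - len(class_rows))]
        for row in class_rows + extra:
            out_rows.append(row)
            out_labels.append(y)
    return out_rows, out_labels


# Hypothetical toy records: (overtime, monthly_income, left_company)
records = [
    ("Yes", 3000, 1),
    ("No", 5200, 0),
    ("No", 6100, 0),
    ("Yes", 2800, 1),
    ("No", 7000, 0),
    ("No", 4900, 0),
]

overtime_encoded, overtime_categories = one_hot([r[0] for r in records])
income_scaled = standardize([r[1] for r in records])
features = [enc + [inc] for enc, inc in zip(overtime_encoded, income_scaled)]
labels = [r[2] for r in records]

balanced_features, balanced_labels = oversample(features, labels)
```

In a real experiment, the standardization statistics and the oversampling would normally be computed on the training split only, so that no information from the held-out data leaks into model fitting.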