
Application of Supervised Machine Learning to Poverty Identification in Rural Gansu Province

Posted on: 2019-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: J R Li
Full Text: PDF
GTID: 2439330575952131
Subject: Applied Statistics
Abstract/Summary:
With changes in the macroeconomic environment, especially in the degree of inequality in national income distribution, the former extensive, region-based model of poverty alleviation can no longer meet current needs. At the same time, as poverty alleviation and development work has advanced and relief efforts have increased, the rural areas and populations that could easily escape poverty have already done so; for the remaining poor areas and populations, the causes and characteristics of poverty are markedly more complex. In response to the current state of rural poverty, the Fifth Plenary Session of the 18th Central Committee of the Communist Party of China adopted the "Proposal of the Central Committee of the Communist Party of China on Formulating the 13th Five-Year Plan for National Economic and Social Development". It sets the poverty alleviation and development goal that, by 2020, "under China's current standards, the rural poor population will be lifted out of poverty, all poor counties will shed their poverty status, and overall regional poverty will be resolved". To meet this goal and move away from the earlier extensive approach, it proposed a series of targeted poverty alleviation policies intended to identify the causes of poverty precisely and to deliver assistance precisely. Gansu is a traditional agricultural province and a less developed province in western China, with one of the higher poverty rates in the country, so its task under targeted poverty alleviation is extremely difficult. Achieving the goal requires, first of all, identifying poverty accurately.

This study uses household survey data from 86 counties in Gansu Province and applies four machine learning methods (decision tree classification, random forest, logistic regression classification, and a neural network) to an empirical study of classifying poor households. The main work of the empirical process is as follows. First, the data are sorted and cleaned and an exploratory analysis is carried out; each sample is labeled as a poor household according to whether it is a minimum living allowance ("low insured") household. Second, the four supervised machine learning methods used in the empirical process are introduced and their advantages and disadvantages are pointed out, as a basis for comparing the subsequent identification results. Third, the sample data are classified with the four models, and the results are evaluated by classification accuracy, the confusion matrix, and the ROC curve. The empirical results show that the random forest model achieves relatively the best classification accuracy for poor households. It is hoped that the study can provide a useful reference for poverty alleviation work in Gansu Province.
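The three evaluation criteria named in the abstract (classification accuracy, confusion matrix, and ROC) can be illustrated with a minimal sketch in plain Python. This is not the thesis's code: the labels and scores below are hypothetical toy values, the 0.5 decision threshold is an assumption, and the ROC is summarized by its area under the curve (AUC), computed with the rank-based pairwise definition.

```python
from typing import List, Tuple


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (TN, FP, FN, TP) for binary labels, where 1 = poor household."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Share of samples whose predicted label matches the true label."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / (tn + fp + fn + tp)


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative one;
    tied scores count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the thesis): 1 = poor, 0 = not poor.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical predicted probabilities of being poor from some classifier.
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("confusion matrix (TN, FP, FN, TP):", confusion_matrix(y_true, y_pred))
print("accuracy:", accuracy(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, scores))
```

Running the same three functions on the predictions of each of the four models would allow a like-for-like comparison of the kind the abstract describes.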
Keywords/Search Tags: Recognition of poverty, Decision Tree Classification, Random Forest, Logistic Regression, Neural Network