Loan Model Based On Low Quality And Small Sample Data

Posted on: 2017-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Li
Full Text: PDF
GTID: 2309330485993942
Subject: Probability theory and mathematical statistics
Abstract/Summary:
In recent years, with the rapid development of the national economy and the encouragement of national policy, a large number of small and medium-sized enterprises have emerged. With them has come a growing demand for small loans. Unlike the traditional loan model, small loans are more flexible. It is therefore essential to determine quickly and accurately whether a business has a loan demand. This paper addresses the loan problem of small and medium-sized enterprises. Based on the analysis process of the enterprise data, it discusses how to carry out data analysis and modelling effectively in the case of small samples and low-quality data.

The paper focuses on how to carry out data preprocessing, exploratory data analysis, and model building when the data contain few features, come from low-quality sources, and include only a small number of samples.

In data preprocessing, we combine the “strict in and loose out” method with the “loose all” method to label the loan data. In exploratory data analysis, we apply single-variable analysis under different loan demands. In model building, we use a model aggregation method based on conditional voting. As a result, we obtain a loan demand model with relatively high stability; the prediction accuracy of the final model was 76%. For modeling, the paper uses logistic regression as the base model, which reduces the risk of overfitting, while in data analysis we fully consider the modeling objectives and subsequent model updating.

Therefore, through special processing and analysis of the data flow, the model better reflects the loan demand of small and medium-sized enterprises under small-sample, low-quality data. Based on the final model, we recommended a number of enterprises and received good feedback.
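The abstract does not define the conditional voting rule, so the following is only a minimal sketch of one possible reading, not the thesis's method. It assumes that several logistic regression base models are fitted on bootstrap resamples, that a plain vote decides when a large enough share of models agree, and that the averaged probability decides otherwise. The function names, the `min_agree` threshold, the L2 penalty, and the bootstrap resampling are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500, l2=0.01):
    """Fit L2-regularised logistic regression by batch gradient descent.

    The penalty and optimiser are assumptions; the abstract only names
    logistic regression as the base model.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_ensemble(X, y, n_models=7, seed=0):
    """Fit one base model per bootstrap resample (hypothetical scheme)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_logistic([X[i] for i in idx], [y[i] for i in idx]))
    return models


def conditional_vote(models, x, min_agree=0.7):
    """Hypothetical conditional vote.

    If at least `min_agree` of the models agree on a class, return that
    class; otherwise fall back to thresholding the mean probability.
    """
    probs = [predict_proba(m, x) for m in models]
    share_positive = sum(p >= 0.5 for p in probs) / len(probs)
    if share_positive >= min_agree:
        return 1
    if share_positive <= 1.0 - min_agree:
        return 0
    return int(sum(probs) / len(probs) >= 0.5)
```

With a small labelled sample `X`, `y`, one would call `models = fit_ensemble(X, y)` and then `conditional_vote(models, x_new)` for each enterprise to be scored. Averaging over resampled base models is one common way to stabilise predictions on small samples, which matches the stability goal stated in the abstract, though the thesis may achieve it differently.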
Keywords/Search Tags: Logistic Regression, data cleaning, exploratory data analysis, data aggregation, data visualization