Research On Binary Classification Machine Learning Model For Ordinal Multinomial Feature Variables

Posted on: 2023-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: X X Zhou
Full Text: PDF
GTID: 2558307073986909
Subject: Statistics
Abstract/Summary:
With the development of network and data storage technology, big data has emerged. As an important tool for processing big data, machine learning has received extensive attention from scholars in many fields and has been widely applied in recent years. It plays a significant role in binary classification tasks, and its applications are becoming broader and more mature. However, because data come from a wide range of fields and take many forms, scholars often apply machine learning methods too mechanically and do not give special data any special treatment. The use of machine learning methods on such special data sets therefore still needs improvement. Ordinal multinomial feature variables are one such kind of special data and are common in fields such as finance. Because they can include irrelevant feature variables and pseudo categories, modelling and predicting the response variable directly, without preprocessing the feature variables, may lead to large prediction deviations and over-fitting. In view of this, for data containing ordinal multinomial feature variables, this paper proposes an improved algorithm, Rank-TD-Classification, for handling ordinal multinomial feature variables in machine learning classification models. The method ranks the feature variables separately by the jackknife estimate of mutual information (JMI) and by Spearman's rank correlation, and then selects the more important feature variables. It then applies the pseudo-category identification and fusion method for ordinal multinomial feature variables, and finally uses machine learning methods to classify and predict on the data after pseudo-category identification and fusion, so that the resulting classification model is more effective and practical. Comprehensive analyses of a bank credit data set and a student performance survey data set demonstrate the practicability and effectiveness of the proposed Rank-TD-Classification method. The method also offers machine learning researchers a new way to improve model performance when processing data with many ordinal multinomial feature variables.
Keywords/Search Tags:ordinal multinomial, feature selection, pseudo category, machine learning
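
The feature-ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the JMI estimator, so the code assumes a generic delete-one jackknife bias correction of the plug-in mutual information for discrete variables, and it computes Spearman's rank correlation as the Pearson correlation of average ranks. All function names (average_ranks, spearman_rho, plugin_mi, jackknife_mi, rank_features) and the synthetic data are hypothetical. The pseudo-category identification and fusion step is not sketched, because the abstract does not describe how it works.

```python
import numpy as np


def average_ranks(v):
    """Ranks starting at 1, with tied values sharing their average rank."""
    _, inv, counts = np.unique(np.asarray(v).ravel(),
                               return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)  # largest rank inside each group of ties
    return (ends - (counts - 1) / 2.0)[inv]


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    if rx.std() == 0 or ry.std() == 0:
        return 0.0  # a constant variable carries no rank information
    return float(np.corrcoef(rx, ry)[0, 1])


def plugin_mi(table):
    """Plug-in mutual information (in nats) from a contingency table of counts."""
    p = table / table.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    mask = p > 0  # empty cells contribute nothing
    return float((p[mask] * np.log(p[mask] / (px @ py)[mask])).sum())


def jackknife_mi(x, y):
    """Delete-one jackknife bias-corrected MI (assumed form, needs n >= 2)."""
    xs, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    ys, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    table = np.zeros((len(xs), len(ys)))
    np.add.at(table, (xi, yi), 1)
    n = table.sum()
    full = plugin_mi(table)
    # Deleting any observation from the same cell gives the same estimate,
    # so one evaluation per non-empty cell, weighted by its count, suffices.
    loo_sum = 0.0
    for i, j in zip(*np.nonzero(table)):
        reduced = table.copy()
        reduced[i, j] -= 1
        loo_sum += table[i, j] * plugin_mi(reduced)
    return n * full - (n - 1) * loo_sum / n


def rank_features(X, y):
    """Score each column of X against y and return both rankings."""
    X = np.asarray(X)
    rho = np.array([abs(spearman_rho(X[:, j], y)) for j in range(X.shape[1])])
    mi = np.array([jackknife_mi(X[:, j], y) for j in range(X.shape[1])])
    return {"spearman": rho, "mi": mi,
            "spearman_order": np.argsort(-rho), "mi_order": np.argsort(-mi)}


# Synthetic illustration: one informative 5-level ordinal feature, one noise feature.
rng = np.random.default_rng(0)
n = 500
x_signal = rng.integers(1, 6, size=n)
x_noise = rng.integers(1, 6, size=n)
y = (x_signal + rng.normal(0.0, 1.0, size=n) > 3).astype(int)
result = rank_features(np.column_stack([x_signal, x_noise]), y)
print(result["spearman_order"], result["mi_order"])
```

How the two rankings are combined into a final feature subset is not stated in the abstract, so the sketch returns them separately and leaves the selection rule to the reader.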