A Category-based Probabilistic Approach To Feature Selection

Posted on: 2019-12-02    Degree: Master    Type: Thesis
Country: China    Candidate: J G Dai    Full Text: PDF
GTID: 2370330548473776    Subject: Probability theory and mathematical statistics
Abstract/Summary:
In the big data era, data of all kinds are growing rapidly. However, not all of the data are valuable, and noninformative features make data analysis and decision making more difficult. Useful features therefore need to be extracted from large amounts of data, and feature selection has become an important way to do this: it selects an effective subset of the original features and thereby reduces the dimension of the data. In addition, a high-dimensional, large-sample categorical data set with a response variable may have many noninformative or redundant categories in its explanatory variables. Identifying and removing these categories usually not only improves the association with the response variable but also gives rise to significantly higher statistical reliability of the selected features. In this thesis, a category-based probabilistic approach is proposed to achieve this goal. Supporting experiments are presented.
Keywords/Search Tags: feature selection, categorical data, dummy variable, information reliability
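
To make the idea of category-level selection concrete, here is a minimal Python sketch. It is not the thesis's method, which the abstract does not specify. As an assumption for illustration only, each category of an explanatory variable is scored by the total variation distance between P(y | category) and the marginal P(y), low-scoring categories are dropped, and the kept categories are encoded as dummy variables. The function names, the scoring rule, and the threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def category_scores(rows, feature, target):
    """Score each category of `feature` by how far P(y | category) is from P(y).

    Assumed scoring rule (not from the thesis): total variation distance.
    """
    n = len(rows)
    y_counts = Counter(r[target] for r in rows)
    by_cat = defaultdict(Counter)
    for r in rows:
        by_cat[r[feature]][r[target]] += 1

    scores = {}
    for cat, counts in by_cat.items():
        m = sum(counts.values())
        # 0 means the category says nothing about y; larger means more informative.
        scores[cat] = 0.5 * sum(
            abs(counts[y] / m - y_counts[y] / n) for y in y_counts
        )
    return scores


def select_categories(rows, feature, target, threshold=0.1):
    """Keep categories whose score reaches the (hypothetical) threshold."""
    scores = category_scores(rows, feature, target)
    return sorted(c for c, s in scores.items() if s >= threshold)


def dummy_encode(rows, feature, kept):
    """One 0/1 dummy column per kept category; dropped categories map to all zeros."""
    return [[1 if r[feature] == c else 0 for c in kept] for r in rows]


if __name__ == "__main__":
    # Toy data: "red" and "blue" determine y, "green" is noninformative.
    rows = (
        [{"color": "red", "y": 1}] * 4
        + [{"color": "blue", "y": 0}] * 4
        + [{"color": "green", "y": 1}] * 2
        + [{"color": "green", "y": 0}] * 2
    )
    kept = select_categories(rows, "color", "y")
    print(kept)
    print(dummy_encode(rows, "color", kept)[0])
```

Selecting at the category level rather than the variable level means a variable can keep its informative categories while its noninformative ones are merged into the all-zero baseline, which is the kind of reduction the abstract describes.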