Applied Research on Attribute Selection in an Unsupervised Learning Environment

Posted on: 2006-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: J X Zhu
GTID: 2209360182997881
Subject: Management Science and Engineering
Abstract/Summary:
The main idea of feature selection is to choose a subset of the available variables by eliminating features that carry little or no discriminative or predictive information. A model built on the selected feature subset performs better than one built on all features. As a preprocessing step for data mining, feature selection is effective at reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving the comprehensibility of results.

Depending on the application of the dataset, feature selection algorithms can be categorized as supervised or unsupervised approaches. The distinction between "supervised" and "unsupervised" data mining methods comes from the classification problem. Methods that learn specific predictive patterns from a training data set with correct classifications are called "supervised". Methods that use only the data itself to find internal structure are called "unsupervised".

Feature selection methods for supervised learning are well established: they perform well, are widely used in practice, and are simple to operate. Typical examples include Relief-F, Information Gain, and Chi-Square. Traditionally, feature selection was understood to mean feature selection for supervised learning. However, as data mining has spread to more application domains, feature selection for unsupervised learning has attracted increasing attention. Without class information for the samples, these feature selection methods cannot produce satisfactory results.

The aim of this thesis is to survey feature selection for unsupervised learning in depth and to provide practical experience for improving the efficiency of data mining in unsupervised settings. First, we give a comprehensive survey of past research on feature selection for unsupervised learning, which forms the theoretical foundation of the thesis. Second, we introduce a novel methodology, ULAC (Feature Selection for Unsupervised Learning Based on Attribute Correlation Analysis and Clustering Algorithm). We then verify the efficiency, significance, and applicability of the ULAC model through experiments. The efficiency analysis shows that ULAC is itself an efficient model. Significance analysis is...
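
The abstract does not describe ULAC's internals, so the following is not the thesis's algorithm. It is only a minimal Python sketch of the general idea the name suggests: analyze correlations between attributes, group redundant attributes together, and keep one representative per group, all without class labels. The function names, the greedy grouping rule, and the `threshold` parameter are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_features(rows, threshold=0.9):
    """Group correlated attributes and keep one representative per group.

    rows: list of equal-length numeric records, with no class labels.
    threshold: hypothetical cutoff on |correlation| above which two
        attributes are treated as redundant (an assumption, not from the thesis).
    Returns the column indices that are kept, in increasing order.
    """
    columns = list(zip(*rows))
    groups = []  # each group is a list of column indices; the first is the representative
    for j, col in enumerate(columns):
        for group in groups:
            # Join the first group whose representative is strongly correlated.
            if abs(pearson(columns[group[0]], col)) >= threshold:
                group.append(j)
                break
        else:
            # No sufficiently correlated group exists, so start a new one.
            groups.append([j])
    return [group[0] for group in groups]


# Column 1 is an exact linear function of column 0; column 2 is only weakly related.
rows = [[1, 3, 5], [2, 5, 1], [3, 7, 4], [4, 9, 2]]
print(select_features(rows))  # [0, 2]
```

Greedy grouping is used here only because it is short; a real implementation could substitute any clustering algorithm over the attribute correlation matrix and any rule for picking each group's representative.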
Keywords/Search Tags:Unsupervised Learning, Feature Selection, ULAC, Data Mining