
Research On Unsupervised Feature Selection Based On BDE-MICI

Posted on: 2020-06-26; Degree: Master; Type: Thesis
Country: China; Candidate: Q Zhu
GTID: 2417330596986782; Subject: Applied statistics
Abstract/Summary:
With the advent of the age of big data, large amounts of high-dimensional, unlabeled data have appeared in real life, for example in the medical and financial fields. In large data sets, some features are redundant or highly correlated with other features. Data preprocessing therefore often removes these redundant and noisy features, a step that is indispensable for further learning. Depending on whether the samples in the original set carry class labels, feature selection methods can be divided into supervised and unsupervised ones. Most real-world data has no class labels, so the research and application of unsupervised feature selection algorithms has become an active research topic and plays an irreplaceable role in the processing of unlabeled data. This thesis studies and analyzes the unsupervised feature selection problem. Building on the principles of the binary differential evolution algorithm and the maximum information compression index, it proposes an unsupervised feature selection algorithm based on binary differential evolution and the maximum information compression index (BDE-MICI). The algorithm uses a fitness function constructed from the properties of the maximum information compression index as the evaluation criterion for candidate subsets, which reduces the redundant and irrelevant features kept during feature selection. After comparing and analyzing existing differential evolution algorithms, the real-number coding is replaced with 0-1 coding, so that the binary differential evolution algorithm combines the optimization speed of differential evolution with the simple operations of feature selection. A self-regulating mutation operator is introduced into binary differential evolution to avoid premature convergence and to increase the probability of finding the global optimum. A performance comparison on seven different types of data sets shows that the improved algorithm is superior to four existing unsupervised feature selection algorithms and to the GA-MICI algorithm.
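As a rough illustration of the two building blocks named in the abstract, the sketch below computes the maximum information compression index for a pair of features using its standard definition (the smallest eigenvalue of the 2x2 covariance matrix of the two features) and decodes a 0-1 coded individual into a feature subset. The abstract does not give the thesis's fitness function or its self-regulating mutation operator, so neither is reproduced here; the function names `mici` and `select_features` are hypothetical and chosen only for this example.

```python
import numpy as np


def mici(x, y):
    """Maximum information compression index of two features.

    Standard definition: the smallest eigenvalue of the 2x2 covariance
    matrix of x and y. It is 0 when the features are linearly dependent
    and grows as they become less linearly related.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.cov(x, y)  # 2x2 sample covariance matrix
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    return float(np.linalg.eigvalsh(cov)[0])


def select_features(X, mask):
    """Decode a 0-1 coded individual (hypothetical helper).

    mask[j] == 1 keeps column j of X, mask[j] == 0 drops it.
    """
    mask = np.asarray(mask, dtype=bool)
    return np.asarray(X)[:, mask]


if __name__ == "__main__":
    x = np.array([1.0, 2.0, 3.0, 4.0])
    print(mici(x, 2 * x + 1))              # close to 0: linearly dependent pair
    print(mici(x, [1.0, -1.0, -1.0, 1.0]))  # larger: uncorrelated pair
    print(select_features(np.arange(6).reshape(2, 3), [1, 0, 1]))
```

A low index for a pair of features signals that one of them carries little information beyond the other, which is why it is a natural ingredient for a redundancy-oriented evaluation criterion of candidate subsets.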
Keywords/Search Tags: feature selection, differential evolution, binary, maximum information compression index