Research On High-dimensional Biomedical Feature Selection Algorithm Based On Intelligent Algorithm

Posted on: 2020-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: J J Ma
GTID: 2404330575492717
Subject: Computer application technology
Abstract:
With the widespread use of gene chip technology in the medical field, large amounts of microarray data are accumulating rapidly. Analyzing these data and building effective classification models is of significant value for the early clinical diagnosis and treatment of potential patients. However, gene microarray data are high-dimensional with small sample sizes; the colon microarray dataset, for example, contains more than two thousand genetic features. Faced with such large-scale microarray datasets, experts cannot analyze them directly and reach a diagnosis in a short period of time. In addition, most genetic data contain redundant or noisy features, which may cause disease diagnosis models to misclassify, and to over-fit when training time is too long. As an effective means of dimensionality reduction, feature selection in the biomedical field has attracted wide attention and become a hot topic in recent years. Feature selection is a key step in the proper analysis and classification of microarray genetic data. Without a suitable feature selection method, it is difficult for existing models to capture the important information accurately. Essentially, the feature selection problem can be viewed as a dual-objective optimization problem that seeks a compact feature subset while maintaining or improving prediction accuracy. Several feature selection methods currently exist for microarray biomedical data. Wrapper-based feature selection methods aim to obtain higher classification accuracy during the search process and are attracting more and more researchers. Because the search strategy is the most important step in the Wrapper method, population-based meta-heuristic search is widely used within it to find the feature subset that gives the best classification performance. To improve the search performance of the Wrapper method, this thesis improves several types of intelligent algorithms for selecting features in high-dimensional biomedical datasets. The main research is as follows:

1. A feature selection strategy based on an improved Clonal Flower Pollination Algorithm (IBCFPA) is proposed. The clonal flower pollination algorithm (CFPA) updates its solutions through the Lévy flight formula and self-pollination. To further improve the search performance of CFPA, an absolute balance grouping strategy is introduced. The current optimal solution found by CFPA is cloned to form a new population, which is then divided into groups. A local update within each group is performed first, followed by a global update between groups. An adaptive optimal Gaussian mutation operation is used to improve the current optimal solution, and a supervisory mechanism determines whether the optimal solution found has fallen into a local optimum. The experimental results show that, compared with other intelligent algorithms, IBCFPA efficiently selects the genes that yield higher classification accuracy.

2. A feature selection strategy based on an improved Coral Reef Optimization Algorithm (BCROSAT) is proposed. The Coral Reef Optimization Algorithm (CRO) is a swarm intelligence algorithm that updates individuals by simulating the reproductive and evolutionary processes of coral reef larvae. First, each coral larva is modeled as a two-dimensional vector during initialization to construct the initial coral population. A tournament selection strategy is then used to select solutions that replace the worst solutions among all corals in the initial population, which both enhances the diversity of the initial population and improves the quality of the initial solutions. To enhance the local search ability of CRO, the simulated annealing algorithm (SA) is used as a local search operator within CRO. The experimental results show that the search performance of BCROSAT is better than that of IGA and MPSO. To verify the performance of BCROSAT, the classification algorithms KNN, SVM and ELM were each used together with 10-fold cross-validation to evaluate the classification accuracy of the algorithm.

3. A feature selection strategy based on an enhanced Wrapper model is proposed. Because the Filter method can screen high-dimensional data quickly, a feature selection strategy combining the Filter and Wrapper models is proposed to improve classification performance. Building on the two Wrapper-based intelligent algorithms above and combining them with the chi-square test from the Filter method, Chi-IBCFPA (chi-square plus IBCFPA) and Chi-BCROSAT (chi-square plus BCROSAT) are proposed. A dual-population initialization strategy is used during initialization: one portion of the initial individuals is pre-processed by the chi-square test, and the other portion is randomly initialized. The experimental results show that the proposed Chi-IBCFPA and Chi-BCROSAT perform significantly better than the hybrid models IG-GA and IG-PSO. The combined Wrapper algorithms can search for the best feature subsets more efficiently.
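
The dual-population initialization and the dual-objective view of feature selection described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the 0.9 inclusion probability, the 50/50 split between seeded and random individuals, and the weighted-sum fitness with weight alpha are all assumptions made for illustration, and the chi-square score assumes the features have already been discretized.

```python
import random
from collections import Counter


def chi_square_score(feature, labels):
    """Chi-square statistic between one discrete feature and the class labels."""
    n = len(labels)
    feature_counts = Counter(feature)
    class_counts = Counter(labels)
    observed = Counter(zip(feature, labels))
    score = 0.0
    for value, n_value in feature_counts.items():
        for cls, n_cls in class_counts.items():
            expected = n_value * n_cls / n
            score += (observed[(value, cls)] - expected) ** 2 / expected
    return score


def dual_population_init(X, y, pop_size, top_k, seeded_fraction=0.5, rng=None):
    """Build binary feature masks: part chi-square seeded, part random.

    Hypothetical helper: the split ratio and the 0.9 inclusion
    probability are illustrative assumptions, not values from the thesis.
    """
    rng = rng or random.Random(0)
    n_features = len(X[0])
    scores = [chi_square_score([row[j] for row in X], y)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    top = set(ranked[:top_k])
    n_seeded = int(pop_size * seeded_fraction)
    population = []
    for i in range(pop_size):
        if i < n_seeded:
            # Seeded individual: only top-ranked features may be switched on.
            mask = [1 if j in top and rng.random() < 0.9 else 0
                    for j in range(n_features)]
        else:
            # Random individual: every feature is switched on with probability 0.5.
            mask = [rng.randint(0, 1) for _ in range(n_features)]
        population.append(mask)
    return population, top


def fitness(mask, accuracy, alpha=0.9):
    """Weighted sum of accuracy and feature reduction (assumed form)."""
    reduction = 1.0 - sum(mask) / len(mask)
    return alpha * accuracy + (1.0 - alpha) * reduction
```

In a Wrapper setting, the accuracy passed to fitness would come from a classifier such as KNN evaluated with cross-validation on the features selected by the mask; that step is omitted here to keep the sketch free of external dependencies.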
Keywords: Feature selection, Intelligent algorithm, Clonal Flower Pollination Algorithm, Coral Reef Optimization Algorithm, Chi-square detection, Enhanced Wrapper model