
Gene Expression Profile Data Classification Based On Support Vector Machine And Genetic Algorithm

Posted on: 2017-07-08, Degree: Master, Type: Thesis
Country: China, Candidate: T T Yuan, Full Text: PDF
GTID: 2334330512970515, Subject: Software engineering
Abstract/Summary:
Cancer is a major threat to human health, so the classification of cancer gene expression profile data plays an important role in bioinformatics research, especially with the development of gene chip technology. Because gene expression profile data are high-dimensional, have small sample sizes, and contain many redundant genes and much noise, classification is computationally expensive and often ineffective. It is therefore of practical significance to design an effective and efficient classification algorithm for gene expression profile data. In this paper, we first calculate a B value for each gene attribute according to the Bhattacharyya distance, rank the genes by this value, and keep the top-ranked genes as an initial dimension reduction by feature selection. We then propose a principal component linear discriminant genetic algorithm (PCLDGA) to extract features from the initially reduced data and reduce the dimension again. Finally, we classify the reduced data with a support vector machine (SVM) model whose parameters are optimized by a genetic algorithm (GA). The experimental results show that this method improves the classification accuracy on cancer gene data.

For multi-core CPU architectures, we propose a new parallel method of GA-based SVM parameter optimization to accelerate the classification process. The main idea is to divide the initial population into multiple small populations that evolve independently and simultaneously on separate workers. The best individuals of each small population are passed down, and the best individuals of all small populations are integrated for the next round of evolution; that is, the excellent individuals that survive the evolutionary computation are combined into a new population to maximize fitness and obtain the optimal combination of SVM parameters. At the same time, the leave-one-out cross-validation operations are processed in parallel. The experiments show that the parallel algorithm for gene expression profile data classification is efficient.
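
To make the first step of the abstract concrete, the following is a minimal sketch of ranking genes by a Bhattacharyya-distance score. The abstract does not say how the B value is computed, so this sketch assumes a two-class problem and models each gene's expression in each class as a univariate Gaussian; the function names `bhattacharyya_1d` and `rank_genes` and the small `eps` variance floor are illustrative choices, not taken from the thesis.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_1d(a, b, eps=1e-12):
    """Bhattacharyya distance between two samples, each modelled as a
    univariate Gaussian (an assumption of this sketch)."""
    mu_a, mu_b = mean(a), mean(b)
    # eps keeps the variances strictly positive for constant-valued genes.
    var_a = pvariance(a) + eps
    var_b = pvariance(b) + eps
    mean_term = 0.25 * (mu_a - mu_b) ** 2 / (var_a + var_b)
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


def rank_genes(samples, labels, top_k):
    """Score every gene (column) and return the indices of the top_k genes.

    samples: list of rows, one row of expression values per sample.
    labels:  list of 0/1 class labels, one per sample.
    Returns (top_indices, scores), where scores[g] is the B value of gene g.
    """
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        class1 = [row[g] for row, y in zip(samples, labels) if y == 1]
        class0 = [row[g] for row, y in zip(samples, labels) if y == 0]
        scores.append(bhattacharyya_1d(class1, class0))
    # Larger distance means the gene separates the two classes better.
    order = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return order[:top_k], scores
```

Calling `rank_genes(samples, labels, top_k)` on an expression matrix keeps only the `top_k` most discriminative columns for the later feature-extraction and SVM stages. How many genes to keep is a tuning choice that the abstract does not specify.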
Keywords/Search Tags: gene expression profile data, classification, dimension reduction, genetic algorithm, support vector machine, multi-core CPU parallel computation