
Gene Feature Selection And Classification Of Cancer Based On Genetic Algorithm And Support Vector Machine

Posted on: 2021-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Tang
GTID: 2404330626965630
Subject: Computer Science and Technology
Abstract/Summary:
Cancer has a high mortality rate, seriously affects public health, and causes millions of deaths every year. The cure rate for early-stage cancer is high, but early cancer shows no obvious clinical symptoms, so many patients die because treatment is delayed. Early detection and early treatment are therefore central to cancer prevention and control. At present, cancer screening relies mainly on imaging: doctors make a diagnosis by observing and analyzing CT and other image data, which makes screening heavily dependent on their clinical experience. The resulting high rate of misdiagnosis and missed diagnosis causes many cancer patients to miss the window for treatment.

Gene chip technology emerged with the development of the life sciences. Because this molecular-level technique for detecting gene expression products does not rely on the experience of clinical staff, it has attracted the attention of many researchers. However, applying gene chip data remains difficult. The experiments are costly and complicated, so the data are high-dimensional, have few samples, and are noisy, which makes them hard to analyze and use. To address these difficulties, this thesis proposes an improved genetic-algorithm-based method for cancer feature selection and classification. The improvements are as follows.

First, the evaluation function combines three measures for each individual: maximum relevance, minimum redundancy, and contribution to population diversity. The weighted sum of these measures is integrated with an evaluation based on classifier accuracy, using coefficients that depend on the generation number. This evaluation function balances population diversity against convergence speed and helps keep the algorithm from converging prematurely to a local optimum.

Second, the mutation operator in a genetic algorithm improves the genetic diversity of the population by randomly introducing genes into it. This thesis designs a mutation operator based on a dominant gene library and a full gene library, according to the characteristics of the population at different stages of the algorithm. For each mutation, the operator selects one of the two libraries with a certain probability. This selection probability depends on the generation number, so that dominant genes can be introduced quickly in the early stage of the algorithm. The improvement balances the randomness and the convergence speed of the genetic algorithm.

Third, an analysis of the individuals in the population shows that in the middle and late stages of the algorithm a large number of redundant (duplicate) individuals appear. These individuals hinder the continued search for the optimal feature subset and lead to premature convergence. To address this, the algorithm adds a population deduplication operation to the genetic algorithm, which removes duplicate individuals and improves the diversity of both individuals and genes in the population.
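
The three improvements described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract gives no formulas, so the linear schedules, the representation of an individual as a frozenset of gene indices, and the function names (dominant_library_probability, blended_fitness, mutate, deduplicate) are all assumptions made for illustration. The filter score passed to blended_fitness stands in for the weighted sum of relevance, redundancy, and diversity terms, whose weights the abstract does not specify.

```python
import random


def dominant_library_probability(generation, max_generations):
    """Chance of mutating from the dominant gene library.

    Assumption: linear decay from 1 to 0, so dominant genes are
    favoured in early generations. The abstract only says the
    probability depends on the generation number.
    """
    return 1.0 - generation / max_generations


def blended_fitness(filter_score, accuracy, generation, max_generations):
    """Combine a filter-style score with classifier accuracy.

    Assumption: the weight on accuracy grows linearly with the
    generation number; the actual coefficients are not given.
    """
    w = generation / max_generations
    return (1.0 - w) * filter_score + w * accuracy


def mutate(individual, dominant_genes, n_genes, generation, max_generations, rng):
    """Swap one selected gene for a gene drawn from one of two libraries.

    `individual` is a frozenset of selected gene indices (hypothetical
    representation). The library is chosen with a generation-dependent
    probability.
    """
    p = dominant_library_probability(generation, max_generations)
    library = dominant_genes if rng.random() < p else range(n_genes)
    candidates = [g for g in library if g not in individual]
    if not candidates or not individual:
        return individual  # nothing to swap in or out
    removed = rng.choice(sorted(individual))
    added = rng.choice(candidates)
    return (individual - {removed}) | {added}


def deduplicate(population):
    """Drop repeated individuals, keeping the first copy and the order."""
    seen = set()
    unique = []
    for individual in population:
        if individual not in seen:
            seen.add(individual)
            unique.append(individual)
    return unique
```

In this sketch, deduplication only shrinks the population. Whether the removed slots are refilled with new individuals is not stated in the abstract, so that step is left out.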
Keywords/Search Tags: Genetic algorithm, Support vector machine, Feature selection, Gene chip, Cancer classification