
Research Of Gene Expression Data Classification Based On Compressive Sensing Algorithm

Posted on: 2013-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: C L Ren
Full Text: PDF
GTID: 2214330371978059
Subject: Biomedical engineering
Abstract/Summary:
Tumors usually arise from disorders of the cell growth mechanism. These typically appear as mutation or abnormal expression of certain genes in the cell, which in turn affects the expression of other genes, alters the expression of some protein molecules, and produces the pathological differences between tumors and the different tumor types seen in clinical diagnosis. With the smooth progress of the Human Genome Project, which started in the 1990s, DNA microarray technology has developed rapidly and brought new hope to the clinical diagnosis and treatment of cancer: the gene expression data produced by DNA microarray experiments allow us to analyze and study cancer etiology at the genetic level. However, a single microarray experiment generates tens of thousands of gene expression measurements at once, so analyzing and processing this vast amount of data and extracting useful biological information poses a new challenge. Analysis of gene expression profiling data is one of the most important branches of bioinformatics. Within it, correctly classifying the different pathological types of cancer is of great significance for clinical diagnosis and treatment. The proposal and development of the Compressive Sensing (CS) algorithm has brought new inspiration for processing high-dimensional gene expression data: if the data can be represented sparsely in another space, then feature selection is no longer a difficult problem during classification, and the large number of features becomes an advantage that the algorithm can exploit. Compressive Sensing has already been applied successfully to face recognition, where it achieved very good classification results.
Gene expression data share the same characteristics as face image data: small sample size and high dimensionality. This paper therefore applies the Compressive Sensing algorithm to gene expression data classification. The paper analyzes and implements the Compressive Sensing algorithm and uses it to classify three common datasets: gastric cancer, lung cancer, and leukemia. First, the K-nearest neighbor algorithm is used to fill in the missing values of the gene expression data, and the data are normalized and then divided into training and test sets. The training set is used to construct an overcomplete dictionary, and a random matrix with Gaussian entries is used to build a sensing matrix with normalized row vectors. The test vector and the training vectors are projected by the sensing matrix, and the minimum l0-norm solution is computed using the simpler l2-norm. The distance in the transform domain between the reconstructed vector and the training vectors is used to determine the class of the test data. After repeated experiments, the compressed sensing algorithm proposed in this paper classified the three cancer gene expression datasets well, with classification accuracies of 98.4%, 99.3%, and 97%. The experimental results show that the compressed sensing algorithm avoids the problem of feature selection and improves classification accuracy and efficiency.
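The pipeline described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the thesis implementation. The function names (`knn_impute`, `unit_rows`, `cs_classify`), the samples-by-genes data layout, the unit-length normalization, the default values of `k` and `m`, and the per-class residual rule (borrowed from sparse-representation classification in face recognition) are all assumptions made here. The abstract only says that a simpler l2-norm solution stands in for the l0-norm problem and that a transform-domain distance decides the class.

```python
import numpy as np

def knn_impute(X, k=3):
    """Fill NaN entries of X (assumed layout: samples x genes) with the
    mean of that gene over the k nearest rows that observed it."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    for i in np.flatnonzero(np.isnan(X).any(axis=1)):
        miss = np.isnan(X[i])
        dists = []
        for j in range(X.shape[0]):
            shared = ~miss & ~np.isnan(X[j])  # genes observed in both rows
            if j != i and shared.any():
                d = np.sqrt(np.mean((X[i, shared] - X[j, shared]) ** 2))
                dists.append((d, j))
        dists.sort()
        for c in np.flatnonzero(miss):
            vals = [X[j, c] for _, j in dists if not np.isnan(X[j, c])][:k]
            if vals:  # entry stays NaN if no neighbour observed this gene
                filled[i, c] = np.mean(vals)
    return filled

def unit_rows(M):
    """Scale every row of M to unit l2 norm (all-zero rows are left as is)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M / np.where(norms == 0, 1.0, norms)

def cs_classify(train, labels, test_vec, m=10, seed=0):
    """Classify test_vec from the training samples (rows of train).

    Assumed steps: unit-normalize samples, use the training samples as
    dictionary columns, project with a Gaussian sensing matrix whose rows
    are normalized, take the minimum l2-norm least-squares coefficients,
    and pick the class with the smallest residual in the projected domain.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    A = unit_rows(np.asarray(train, dtype=float)).T       # genes x n_train
    y = np.asarray(test_vec, dtype=float)
    y = y / np.linalg.norm(y)
    Phi = unit_rows(rng.standard_normal((m, A.shape[0])))  # m x genes
    A_c, y_c = Phi @ A, Phi @ y                            # projected data
    # lstsq returns the smallest-2-norm solution when A_c is underdetermined
    x, *_ = np.linalg.lstsq(A_c, y_c, rcond=None)
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)  # keep only class-c coefficients
        residuals[c] = float(np.linalg.norm(y_c - A_c @ x_c))
    return min(residuals, key=residuals.get), residuals
```

With `m` smaller than the number of training samples, the projected dictionary has more columns than rows, which is one way to read "overcomplete" here. A call such as `cs_classify(knn_impute(train), labels, test_vec)` returns the predicted label together with the per-class residuals.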
Keywords/Search Tags: Compressive Sensing (CS), sparse representation, overcomplete dictionary, tumor gene expression data