Tumor Gene Expression Data Classification Based On Manifold Learning
Posted on: 2011-04-05 | Degree: Master | Type: Thesis | Country: China | Candidate: F L Wu | GTID: 2204360305486076 | Subject: Control theory and control engineering
Abstract/Summary:
Tumors are among the major diseases affecting human health, yet current tumor diagnosis and treatment still need improvement. Compared with conventional methods, molecular diagnosis based on gene expression profiles is more accurate. It can detect the progression and degree of deterioration of a tumor, tolerance to anti-cancer drugs, and so on, giving clinicians an important reference for diagnosing the tumor type, planning treatment, and analyzing prognosis. Microarray data, characterized by high dimensionality and small sample sizes, continue to accumulate. How to extract useful information or regularities from such high-dimensional data effectively has become one of the urgent problems in information science and technology.
However, it is very difficult to select, from the thousands of genes in a gene expression profile, a small set of feature genes with good classification capability. An exhaustive search over such a large gene space is usually impossible, so choosing a suitable feature extraction method is very important.
In this thesis, we apply a new feature extraction method based on a manifold learning algorithm. We then study two-class and multi-class classification problems, comparing this method with several other manifold learning algorithms.
Lastly, we conduct a comparative classification study on cross-platform tumor data using the CMVM (Constrained Maximum Variance Mapping) and LLDE (Locally Linear Discriminant Embedding) algorithms.
The main work of this thesis is as follows.
First, we apply a feature extraction method, Constrained Maximum Variance Mapping (CMVM), to extract gene features from tumor samples, and then classify the samples with a K-NN classifier. In the two-class experiments, we perform feature extraction and recognition rate analysis on the prostate cancer and breast cancer datasets. In the multi-class experiments, we do the same on the leukemia and central nervous system tumor datasets. These experiments on gene data from different tumor samples confirm the feasibility and effectiveness of the method.
Second, we apply manifold learning algorithms to feature extraction from cross-platform tumor gene expression data, then classify the results with a K-NN classifier to compare recognition performance.
Finally, the thesis points out problems that remain in feature extraction and classification of tumor gene expression data and notes that further research is needed.
Keywords/Search Tags: Manifold learning, Feature extraction, LDA, PCA, PLS, SLLE, CMVM, Subspace, Cross-platform, Gene expression data, LLDE
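The extract-then-classify pipeline described in the abstract (reduce the gene dimension, then label samples with K-NN) can be illustrated with a minimal sketch. This is not the thesis's CMVM or LLDE implementation: as a stand-in it uses a single PCA direction (PCA appears among the keywords), found by power iteration. The data are synthetic, and every function name and parameter below is hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def center(rows, means=None):
    """Subtract per-gene means (computed from `rows` unless supplied)."""
    d = len(rows[0])
    if means is None:
        means = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows], means


def top_component(centered, iters=100, seed=0):
    """Leading principal direction via power iteration on X^T X.

    Only the products X v and X^T s are formed, never the d x d covariance,
    which suits data with far more genes (d) than samples (n).
    """
    rng = random.Random(seed)
    d = len(centered[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(d)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return v


def project(centered, v):
    """One-dimensional embedding: score of each sample along v."""
    return [sum(x * w for x, w in zip(row, v)) for row in centered]


def knn_predict(train_z, train_y, z, k=3):
    """Majority vote among the k training scores closest to z."""
    nearest = sorted(zip(train_z, train_y), key=lambda p: abs(p[0] - z))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def make_toy_data(n_per_class, n_genes, rng, shift=3.0, n_informative=20):
    """Synthetic two-class 'expression' data; NOT real microarray values."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for j in range(n_informative):
                    row[j] += shift  # class 1 over-expresses the first genes
            X.append(row)
            y.append(label)
    return X, y


def run_demo(seed=0):
    """Fit the projection on training samples only; return held-out accuracy."""
    rng = random.Random(seed)
    X_train, y_train = make_toy_data(10, 100, rng)
    X_test, y_test = make_toy_data(5, 100, rng)
    Xc, means = center(X_train)
    v = top_component(Xc)
    z_train = project(Xc, v)
    z_test = project(center(X_test, means)[0], v)  # reuse training means
    hits = sum(knn_predict(z_train, y_train, z) == t for z, t in zip(z_test, y_test))
    return hits / len(y_test)
```

Calling `print(run_demo())` reports the fraction of held-out toy samples classified correctly. A supervised method such as CMVM or LLDE would replace `top_component` and `project`, while the K-NN step would stay the same, which is what makes recognition rates comparable across feature extraction methods.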