
Research On Marker Gene Selection For Tumor Classification Based On Gene Expression Profiles

Posted on: 2009-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Duan
GTID: 2144360242994096
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
In the last few years, DNA microarray technology has become a fundamental tool in genomic research and has introduced a paradigm shift in biology, moving experimental approaches from single-gene studies to genome-level analyses. Recent studies have shown that microarrays provide a successful route to a comprehensive understanding of the genetic alterations present in tumors. However, the method faces a major challenge arising from the characteristics of microarray data sets: very high dimensionality (a large number of genes) combined with a small number of samples. Feature selection (marker gene selection) is crucial to tumor classification from gene expression data for several reasons, such as improving classification accuracy, reducing cost in a clinical setting, and gaining insight into the mechanism of disease.

In this dissertation, marker gene selection for tumor classification is investigated based on gene expression data. The main contributions of our research are summarized below:

1. We developed an improved SVM-RFE method, based on correlation, to extract marker genes from a small set of representative genes. The improved SVM-RFE runs faster than the standard SVM-RFE method without reducing accuracy. The method was applied to the ALL/AML dataset. Experimental results show that it selects a small number of marker genes with minimum redundancy while achieving better classification accuracy.

2. The usefulness of K-means for analyzing gene expression data is limited because the clustering result depends heavily on a user-defined parameter, namely the number of clusters (k). We proposed a Heuristic K-means clustering algorithm that automatically determines a semi-optimal number of clusters according to the statistical nature of the data. The results demonstrate successful automatic partitioning of the gene expression data.

3. We proposed a novel hybrid approach, Filter Clustering SVM-RFE, to improve gene selection for tumor classification. The method combines gene ranking, clustering, and a wrapper method to select marker genes, taking advantage of both the efficiency of gene ranking and the high accuracy of wrapper methods. It was applied to two public datasets, and experimental results show that it selects fewer marker genes with minimum redundancy while achieving better classification accuracy...
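
For readers unfamiliar with the baseline that the contributions above build on, the following is a minimal, dependency-free Python sketch of standard SVM-RFE: train a linear SVM, rank each gene by its squared weight, drop the lowest-ranked gene, and repeat. It is not the improved correlation-based variant or the Filter Clustering SVM-RFE pipeline from the thesis, whose details the abstract does not give. The function names (`train_linear_svm`, `svm_rfe`), the hyperparameters, and the toy expression matrix are all assumptions made for illustration; the data is invented and unrelated to the ALL/AML dataset.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM by full-batch subgradient descent on the hinge loss.

    X is a list of samples (each a list of floats); y holds labels in {-1, +1}.
    Returns the weight vector w and the bias b.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # margin violated, so the hinge term is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, n_keep):
    """Standard SVM-RFE: repeatedly drop the gene with the smallest w_j ** 2.

    Returns (kept gene indices, eliminated gene indices in removal order).
    """
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        # Retrain on the genes that are still in play.
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # Ranking criterion: squared weight of each remaining gene.
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated


# Toy expression matrix (invented): 6 samples x 4 genes.
# Gene 0 separates the two classes; genes 1-3 are small noise.
X = [
    [2.0, 0.1, -0.2, 0.3],
    [1.8, -0.3, 0.1, -0.1],
    [2.2, 0.2, 0.3, 0.0],
    [-2.0, 0.0, -0.1, 0.2],
    [-1.9, -0.2, 0.2, -0.3],
    [-2.1, 0.3, 0.0, 0.1],
]
y = [1, 1, 1, -1, -1, -1]

selected, eliminated = svm_rfe(X, y, n_keep=1)
print("selected genes:", selected)
print("elimination order:", eliminated)
```

Removing one gene per iteration, as here, is the simplest form of the procedure. On real microarray data with thousands of genes it requires one SVM fit per removed gene, which is the kind of cost that motivates pre-filtering or removing genes in batches.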
Keywords/Search Tags: feature gene selection, gene expression profiles, tumor classification, bioinformatics