
Maxmean Based Expressed Gene Scanning

Posted on: 2014-02-22  Degree: Master  Type: Thesis
Country: China  Candidate: Y C Peng  Full Text: PDF
GTID: 2267330425972985  Subject: Applied Statistics
Abstract/Summary:
Gene selection from gene expression data is of great importance in modern bioinformatics research. At present, no systematic method is available for scanning gene expression data, which are characterized by high dimensionality, small sample sizes, and great redundancy, for genes that carry significantly expressed biological information. In light of this situation, this paper combines cluster analysis with a test statistic in order to select the genes that contribute most to the sample type. Uninformative variable elimination (UVE) based on the least squares method and the Monte Carlo method were both applied to eliminate genes that contain little or none of the desired biological information. The intersection of the two groups of genes selected by the two methods was then taken, and the self-organizing map (SOM) algorithm was used to run a cluster analysis on this set of genes, which are correlated with each other to a certain degree. Based on the cluster analysis, corresponding subsets of genes were established. A numerical simulation was carried out to choose the gene set analysis (GSA) method, which was used to build the max-mean statistic for selecting the subsets consisting of significantly expressed genes in the later statistical analysis. A principal component analysis (PCA) projection was then used to verify how effectively the selected subsets of genes categorize the subject samples.

The test sample consists of 45,037 data items concerning four sets of normal genes and four sets of cancer genes collected in a hospital test. From this sample, 1,681 candidate genes were selected. These selected genes were divided into 50 subsets by the SOM method. Max-mean statistics were then computed for these 50 subsets, which helped to select four groups of significantly expressed genes: two groups of up-regulated genes, Set 17 and Set 5, and two groups of down-regulated genes, Set 3 and Set 13. The four groups of genes were then used to separate the two types of samples mentioned above. This paper includes 18 figures, 8 tables, and 61 references.
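The abstract does not state the max-mean formula, so the following is only a minimal sketch of the statistic as it is commonly defined in gene set analysis: the mean of the positive parts and the mean of the negative parts of the gene-level scores are computed over the whole set, and the larger of the two is reported with its sign. The function names `maxmean` and `score_gene_sets`, the per-gene scores `z` (for example, two-sample t-statistics comparing cancer and normal samples), and the cluster labels are illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def maxmean(z):
    """Signed max-mean of the gene-level scores in one gene set.

    Assumption: z holds one score per gene (e.g. a t-statistic).
    Positive and negative parts are each averaged over ALL genes in
    the set; the larger average wins and keeps its sign.
    """
    z = np.asarray(z, dtype=float)
    if z.size == 0:
        raise ValueError("gene set is empty")
    pos = np.maximum(z, 0.0).mean()   # average positive part
    neg = np.maximum(-z, 0.0).mean()  # average negative part (as a magnitude)
    return float(pos) if pos >= neg else float(-neg)


def score_gene_sets(z, labels):
    """Max-mean score for every gene subset.

    Assumption: labels[i] is the subset (e.g. SOM cluster) of gene i.
    Returns a dict mapping each subset label to its max-mean score.
    """
    z = np.asarray(z, dtype=float)
    labels = np.asarray(labels)
    if z.shape != labels.shape:
        raise ValueError("z and labels must have the same shape")
    return {lab: maxmean(z[labels == lab]) for lab in np.unique(labels).tolist()}


if __name__ == "__main__":
    # Toy scores for six genes split into two hypothetical subsets.
    scores = [2.0, -1.0, 0.0, 3.0, -4.0, 1.0]
    subsets = [0, 0, 0, 0, 1, 1]
    print(score_gene_sets(scores, subsets))
```

Under this definition, a strongly positive score suggests an up-regulated subset and a strongly negative score a down-regulated one. In practice, significance would be assessed against a permutation or restandardized null distribution, which this sketch leaves out.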
Keywords/Search Tags: UVE, SOM, gene sets, max-mean