Study On The Statistical Methods In Classifying Samples By Gene Expression Profile

Posted on: 2007-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: X Fang
Full Text: PDF
GTID: 2144360185485093
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: This thesis explores statistical methods for classifying samples by gene expression profile. In particular, it seeks effective methods for reducing the number of genes, evaluates how well different statistical methods classify samples, and examines a method for assessing the quality of the resulting classification.

Methods: The data comprised 72 leukemia patients and the expression values of 7,129 genes. First, the number of genes was reduced by eliminating genes whose expression values changed little, followed by variable clustering analysis, selection of representative genes, and collinearity diagnostics. Second, the representative gene variables were analyzed with 11 clustering analysis methods. Third, the 11 clustering analysis methods were applied to classify the leukemia samples, and the prediction strength (PS) method was applied to assess the clustering performance of each. Lastly, the error rate examination method and validity indicators were used to evaluate the feasibility of the PS method.

Results: 48 significantly correlated genes were selected by eliminating genes whose expression values changed little, variable clustering analysis, selection of representative genes, and collinearity diagnostics. Only the Flexible-Beta method and Ward's minimum-variance method could cluster the leukemia gene data into the two kinds of leukemia. Comparison of the 11 clustering analysis methods by the PS method showed that the Flexible-Beta method was the best suited of the 11 for clustering the leukemia samples. The evaluation of the 11 clustering analysis methods by the PS method was consistent with the evaluation by the error rate examination method and the validity indicators.

Conclusions: It is feasible to classify samples by analyzing leukemia gene...
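Two of the steps named in the abstract, dropping genes whose expression changes little and checking a clustering against known diagnoses by its error rate, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the variance threshold `min_var`, the function names, and the toy labels "A" and "B" are assumptions made here for illustration, and the abstract does not state which variation criterion or error rate definition was actually used.

```python
from itertools import permutations
from statistics import pvariance


def filter_low_variance(expr, min_var):
    """Keep genes whose expression variance across samples is at least min_var.

    expr maps a gene name to its list of expression values, one per sample.
    min_var is a hypothetical threshold, not a value taken from the thesis.
    """
    return {gene: values for gene, values in expr.items()
            if pvariance(values) >= min_var}


def clustering_error_rate(true_labels, cluster_ids):
    """Fraction of samples misassigned under the best cluster-to-class mapping.

    Cluster ids are arbitrary, so every one-to-one assignment of cluster ids
    to class labels is tried and the one with the most agreements is used.
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(classes):
        raise ValueError("more clusters than known classes")
    best_hits = 0
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping[c] == t for c, t in zip(cluster_ids, true_labels))
        best_hits = max(best_hits, hits)
    return 1 - best_hits / len(true_labels)


# Toy data: two genes over four samples, and five samples in two classes.
toy_expr = {"gene_flat": [1.0, 1.0, 1.0, 1.0],
            "gene_varying": [0.0, 10.0, 0.0, 10.0]}
kept = filter_low_variance(toy_expr, min_var=1.0)

toy_truth = ["A", "A", "A", "B", "B"]
toy_clusters = [1, 1, 0, 0, 0]
error = clustering_error_rate(toy_truth, toy_clusters)
```

In the toy example the constant gene is dropped, and one of the five samples ends up in the wrong cluster once cluster 1 is matched to class "A" and cluster 0 to class "B".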
Keywords/Search Tags:clustering analysis, leukemia, gene expression, gene selection, statistical method