Research On Statistical Analysis Methods Of Gene Expression Data Based On Metric Learning

Posted on: 2022-12-29 | Degree: Master | Type: Thesis
Country: China | Candidate: G Han | Full Text: PDF
GTID: 2510306611995849 | Subject: Biology
Abstract/Summary:
The Pearson correlation coefficient (PCC) is the most widely used measure of co-expression similarity between genes. However, because biological processes are diverse and public expression datasets are heterogeneous, PCC may not be the optimal way to measure co-expression. To measure co-expression between genes more accurately, this thesis proposes Metric Learning for Co-expression (MLC), a fast algorithm that assigns a specific weight to each expression sample. More specifically, if two genes are both annotated with a given Gene Ontology (GO) term, MLC maximizes the weighted co-expression between them; if only one of the two genes is annotated, the weighted co-expression is minimized. When co-expression is then computed over all samples, the samples with high weights dominate the measurement, while the samples with low weights have less influence on it. To verify the performance of MLC, four methods were used to measure co-expression in publicly available prostate cancer data, and the results were compared and analyzed. The experimental results show that co-expression weighted by the MLC algorithm performs significantly better than the standard PCC and more accurately identifies which genes take part in the same biological process under different conditions.
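The idea of a sample-weighted co-expression measure can be illustrated with a weighted Pearson correlation. The following is a minimal sketch, not the thesis implementation: the abstract does not state the exact form of the weighted co-expression or how MLC learns the weights, so the function name `weighted_pearson`, the standard weighted-correlation formula, and the toy weights below are all assumptions made for illustration.

```python
import math


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation between two expression profiles.

    x, y: expression values of two genes across the same samples.
    w:    one non-negative weight per sample (hypothetical here; in MLC
          the weights would be learned from GO annotations).
    """
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    total = float(sum(w))
    if total <= 0 or any(wi < 0 for wi in w):
        raise ValueError("weights must be non-negative and not all zero")

    # Normalize the weights so they sum to 1.
    w = [wi / total for wi in w]

    # Weighted means of both profiles.
    mx = sum(wi * xi for wi, xi in zip(w, x))
    my = sum(wi * yi for wi, yi in zip(w, y))

    # Weighted covariance and variances.
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant profile")

    return cov / math.sqrt(vx * vy)


# Toy data: the two genes agree in four samples and disagree in the fifth.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 2.0, 3.0, 4.0, -10.0]

uniform = [1.0] * 5                       # equal weights reduce to standard PCC
downweighted = [1.0, 1.0, 1.0, 1.0, 0.0]  # assumed weights ignoring sample 5

r_uniform = weighted_pearson(gene_a, gene_b, uniform)
r_weighted = weighted_pearson(gene_a, gene_b, downweighted)
```

With equal weights the function reduces to the standard PCC, which is negative for this toy data because of the fifth sample. Setting that sample's weight to zero leaves only the four agreeing samples, so the weighted correlation becomes strongly positive. This is the sense in which low-weight samples have less influence on the measurement.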
Keywords/Search Tags:Gene co-expression, GO term, GEO database, Metric learning