A Comprehensive Evaluation Of Computational Tools For Analyzing DE Of MicroRNAs And Quantifying Immune Cells Using DNA Methylation Data

Posted on: 2021-05-07  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Amanullah
GTID: 1360330614467707  Subject: Bioinformatics
Abstract/Summary:
In this dissertation, we performed comprehensive evaluations of computational tools for 1) detecting differential expression of microRNA (miRNA) from small RNA sequencing data, and 2) estimating immune cell proportions in complex tissues using DNA methylation (DNAm) data.

Chapter 1 provides an overall review of miRNA, focusing on miRNA biogenesis, the classification of miRNA isoforms, miRNA target prediction, and the roles of miRNA in cancer.

Chapter 2 reviews the literature on deconvolution methods that use DNAm data in cancer. We briefly discuss the potential significance of Epigenome-Wide Association Studies (EWAS), DNAm databases, DNAm-based deconvolution analysis, and the immune response to cancer.

Chapter 3 addresses miRNA isoforms (isomiRs), which are produced from the same arm as the archetype miRNA and differ from it by a few nucleotides at the 5' and/or 3' termini. These well-conserved isomiRs are functionally important and have contributed to the evolution of miRNA genes. Accurate detection of differentially expressed miRNAs can bring new insights into the cellular function of miRNA and further improve miRNA-based diagnostic and prognostic applications. However, very few methods take isomiR variation into account when analyzing miRNA differential expression. To overcome this challenge, we developed a novel approach that takes advantage of the multidimensional structure of isomiR data from the same miRNA, termed multivariate differential expression by Hotelling's T² test (MDEHT). Using the information hidden in isomiRs enables MDEHT to increase the power to identify differentially expressed miRNAs that are not marginally detectable by univariate testing methods. We conducted rigorous and unbiased comparisons of MDEHT with seven commonly used tools on simulated datasets and on real datasets from The Cancer Genome Atlas (TCGA). Our evaluations demonstrated that MDEHT was robust across datasets and outperformed the other commonly used tools in terms of type I error rate, true positive rate, and reproducibility.

Chapter 4 addresses the challenging task of identifying innovative biomarkers for quantifying immune cell types in bulk tissue samples in large EWAS. Although various deconvolution methods using DNAm data have been developed recently, their performance has not been thoroughly evaluated. Here, we present novel reference data based on CpG islands from DNAm data, which provide more accurate quantification of immune cell types in a bulk tissue sample. We used an ANOVA model followed by Tukey's honestly significant difference (HSD) test to select, for each cell type, the cell-type-specific CpG sites that are highly methylated in that cell type. Furthermore, we conducted a comprehensive simulation study to compare the performance of six cell deconvolution methods using our newly proposed reference data. The six methods evaluated are commonly used for estimating cell-type composition from bulk tissues: CIBERSORT, Constrained Projection (CP), Robust Partial Correlations (RPC), Estimate the Proportion of Immune and Cancer cells (EPIC), DeconRNASeq, and RefFreeEWAS. Our evaluations showed that the reference-based CP method outperformed the other deconvolution methods in terms of model prediction error rate. Accurate estimation of cell-type proportions from bulk tissue samples can bring new insights into disease pathogenesis in EWAS. In the real data analysis, we used DNAm and gene expression datasets from pan-cancer TCGA samples. For the gene expression data, we used the LM22 reference signature matrix. Immune cell type compositions were estimated separately from gene expression and DNAm data across various cancer types. Finally, survival analysis showed that deconvolution of immune cell subtypes based on DNAm data may help physicians make better clinical decisions and improve immunotherapies for patients.
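To make the multivariate test named for Chapter 3 concrete, the following is a minimal sketch of a two-sample Hotelling's T² statistic in plain Python. It is not the MDEHT implementation: the function names (`solve`, `scatter`, `hotelling_t2`) are hypothetical, and it assumes each row holds the (already normalized) expression values of the isomiRs of one miRNA in one sample, with the two groups sharing a common covariance matrix.

```python
from statistics import mean


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular pooled covariance matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def scatter(rows, centre):
    """Sum of outer products of the centred rows (unnormalised covariance)."""
    p = len(centre)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [v - c for v, c in zip(row, centre)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s


def hotelling_t2(group1, group2):
    """Return (T2, F statistic, (df1, df2)) for two groups of samples.

    Each group is a list of samples; each sample is a list of p values
    (assumed here: one value per isomiR of the same miRNA).
    """
    n1, n2, p = len(group1), len(group2), len(group1[0])
    m1 = [mean(col) for col in zip(*group1)]
    m2 = [mean(col) for col in zip(*group2)]
    s1, s2 = scatter(group1, m1), scatter(group2, m2)
    pooled = [[(s1[i][j] + s2[i][j]) / (n1 + n2 - 2) for j in range(p)]
              for i in range(p)]
    diff = [a - b for a, b in zip(m1, m2)]
    w = solve(pooled, diff)  # pooled^{-1} (m1 - m2)
    t2 = n1 * n2 / (n1 + n2) * sum(d * wi for d, wi in zip(diff, w))
    df2 = n1 + n2 - p - 1
    f_stat = t2 * df2 / (p * (n1 + n2 - 2))  # F(p, df2) under the null
    return t2, f_stat, (p, df2)
```

With a single isomiR (p = 1) the statistic reduces to the square of the pooled-variance two-sample t statistic, which is a convenient sanity check. A p-value would come from the upper tail of the F distribution with the returned degrees of freedom, for example via `scipy.stats.f.sf` if SciPy is available.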
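The reference-based deconvolution idea behind Constrained Projection can be sketched as a constrained least-squares fit: given a reference matrix of mean methylation values (CpG sites by cell types) and a bulk methylation profile, find non-negative cell-type weights summing to at most one that best reconstruct the bulk profile. The sketch below is an illustration under assumptions, not the dissertation's code: CP is usually solved with quadratic programming, whereas this version swaps in projected gradient descent to stay dependency-free, and the names `project_capped_simplex` and `deconvolve` are hypothetical.

```python
def project_capped_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) <= 1}."""
    clipped = [max(x, 0.0) for x in v]
    if sum(clipped) <= 1.0:
        return clipped
    # Otherwise the projection lies on the face sum(w) == 1:
    # use the standard sort-based simplex projection.
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        t = (cumulative - 1.0) / j
        if uj - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def deconvolve(reference, bulk, iterations=5000):
    """Estimate cell-type proportions from one bulk methylation profile.

    reference: list of rows, one per CpG site, each with one mean
               methylation value per cell type.
    bulk:      list of bulk methylation values, one per CpG site.
    Minimises 0.5 * ||reference @ w - bulk||^2 subject to w >= 0 and
    sum(w) <= 1 by projected gradient descent.
    """
    n_sites, n_types = len(reference), len(reference[0])
    # 1 / squared Frobenius norm is a safe step size, because it bounds
    # the largest eigenvalue of reference^T reference from above.
    step = 1.0 / sum(x * x for row in reference for x in row)
    w = [1.0 / n_types] * n_types
    for _ in range(iterations):
        residual = [sum(r * wi for r, wi in zip(row, w)) - y
                    for row, y in zip(reference, bulk)]
        grad = [sum(reference[i][k] * residual[i] for i in range(n_sites))
                for k in range(n_types)]
        w = project_capped_simplex([wi - step * g for wi, g in zip(w, grad)])
    return w
```

Because the fit depends entirely on how well the reference columns separate the cell types, the choice of cell-type-specific CpG sites (here selected by ANOVA followed by Tukey's HSD test) matters as much as the solver.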
Keywords/Search Tags: Cancer, Deconvolution analysis, Differential expression, miRNA isoforms, Hotelling's T² statistic, Multivariate analysis, EWAS, Cellular heterogeneity, DNA methylation