
Identifying The Disease-associated Differentially Expressed Genes In The Absence Of Normal Control

Posted on: 2019-04-28 | Degree: Master | Type: Thesis
Country: China | Candidate: M R Chi | Full Text: PDF
GTID: 2404330569981099 | Subject: Biochemistry and Molecular Biology
Abstract/Summary:
A basic task in gene expression analysis is identifying the differentially expressed genes (DEGs) between two types of samples, for example, disease versus normal. However, in research on vital organs such as the heart, pulmonary artery, and brain, normal control samples are very difficult to obtain, which makes it impossible to identify disease DEGs directly. To address the lack of normal controls (CTRs), many researchers have proposed methods for jointly analyzing gene expression profiles measured by different laboratories, such as normalization and batch-effect removal algorithms. In practice, however, these methods often distort the real biological signal while correcting for batch effects. In previous work, the RankComp algorithm was proposed to identify individual-level DEGs. It is based on the observation that the relative expression orderings (REOs) of genes are overall stable in a particular type of normal tissue but widely disturbed in the corresponding disease tissue, and that REOs are insensitive to experimental batch effects. Expression profiles of normal tissues measured by different laboratories can therefore be integrated into a single dataset. However, when the accumulated samples are insufficient, RankComp performs poorly in detecting DEGs at the population level.

In this work, we improve RankComp for identifying population-level DEGs and call the improved algorithm RankPop. To evaluate the algorithm, we first collected 160 normal left ventricular (LV) myocardium samples, measured by different laboratories, from public databases. Next, we identified the gene pairs with significantly stable REOs (binomial test, FDR < 0.05) that were consistently detected in samples measured on the same or on different platforms. We then applied RankComp and RankPop to both large-scale and small-scale simulated datasets. In the small-scale dataset, the sensitivity of RankComp for population-level DEGs was only 75.33%, whereas RankPop reached 88.60%. These results show that RankPop outperforms RankComp in identifying population-level DEGs.

We then applied RankPop to 137 patients with dilated cardiomyopathy (DCM) and 119 patients with ischemic cardiomyopathy (ICM) to identify population-level DEGs. Comparing these DEGs with the integrated lists of DCM and ICM DEGs identified by t-test, we found that 98.30% and 99.38% of the population-level DEGs had the same direction of dysregulation in DCM and ICM, respectively. Lastly, we constructed a protein-protein interaction (PPI) network of the genes with high coverage across patients. Among the population-level DEGs, MNS1, SFRP4, and CCL2 were dysregulated in more than 85% of patients with DCM. Similarly, 25 genes were dysregulated in more than 85% of patients with ICM. Moreover, SFRP4 and FURIN are drug targets for heart disease. The PPI network of these genes was significantly enriched in heart-related pathways, including the "Wnt signaling pathway" and the "Hippo signaling pathway". Taken together, these analyses demonstrate that expression profiles of normal controls measured on different platforms can be integrated. This provides an effective strategy for analyses that lack normal samples, and it can be used to identify both individual-level and population-level DEGs.
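The stable-REO screening step mentioned in the abstract (binomial test, FDR < 0.05) can be illustrated with a minimal Python sketch. This is not the RankComp or RankPop implementation from the thesis. The function names, the two-sided form of the test, the Benjamini-Hochberg correction, and the handling of ties are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def binom_tail(k, n, p=0.5):
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def stable_reo_pairs(samples, genes, fdr=0.05):
    """Return (higher, lower) gene pairs whose ordering is stable across samples.

    samples: list of dicts mapping gene name -> expression value
             (assumed layout, one dict per normal sample).
    Ties are ignored (assumption); the test is two-sided against p = 0.5.
    """
    pairs, pvals = [], []
    for a, b in combinations(genes, 2):
        gt = sum(1 for s in samples if s[a] > s[b])
        lt = sum(1 for s in samples if s[a] < s[b])
        n = gt + lt
        if n == 0:
            continue  # every sample is tied, so no ordering can be assessed
        pairs.append((a, b) if gt >= lt else (b, a))
        pvals.append(min(1.0, 2 * binom_tail(max(gt, lt), n)))
    adjusted = benjamini_hochberg(pvals)
    return [pair for pair, q in zip(pairs, adjusted) if q < fdr]


# Toy data: A > B in every sample, while C alternates above and below both.
toy_samples = [{"A": 10.0, "B": 5.0, "C": 12.0 if i % 2 else 2.0} for i in range(12)]
print(stable_reo_pairs(toy_samples, ["A", "B", "C"]))
```

Because only within-sample orderings are compared, any batch effect that preserves the ranking of genes within a sample leaves the counts unchanged, which is the property the abstract relies on when pooling normal samples from different laboratories.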
Keywords/Search Tags:gene expression profiles, normal controls, differentially expressed genes, gene expression orderings, cardiomyopathy