Statistically Controlled Identification Of Differentially Expressed Genes In One-to-one Cell Line

Posted on: 2018-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: J He
GTID: 2334330536978808
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
The Connectivity Map (CMAP) database, an important public data source for drug repositioning, archives gene expression profiles from cancer cell lines treated with and without bioactive small molecules. However, there are only one or two technical replicates for each cell line under a given treatment condition. For such small-scale data, current methods lack statistical control when identifying differentially expressed genes (DEGs) in treated cells. In particular, a one-to-one comparison may yield too many drug-irrelevant DEGs introduced by random experimental factors. To tackle this problem, CMAP adopts a pattern-matching strategy to build a “connection” between disease signatures and the gene expression changes associated with drug treatments. However, many drug-irrelevant genes may blur the “connection” if all genes are used instead of pre-selected DEGs induced by the drug treatments.

We applied OneComp, a method adapted from RankComp, to identify DEGs in such small-scale cell line datasets. For a given cell line, gene pairs with stable relative expression orderings (REOs) were identified in a large collection of control cell samples measured in different experiments; these pairs formed the background stable REOs. When OneComp was applied to a small-scale cell line dataset, the background stable REOs were customized by filtering out the gene pairs whose REOs were reversed in the control samples of the analyzed dataset. To mimic a dataset with only one technical replicate, we generated several independent pairs of treated and control samples by dividing a large dataset into sub-datasets. DEGs identified in the large dataset were used as the gold standard to evaluate the performance of OneComp in each sub-dataset. The consistency scores of the overlapping genes between the DEGs identified by OneComp and by SAM were all higher than 99%. The average consistency score of the DEGs identified solely by OneComp was 96.85% according to the observed expression difference method.

The usefulness of OneComp in drug repositioning was exemplified by identifying phenformin- and metformin-related genes from small-scale cell line datasets, which helped to support both as potential anti-tumor drugs for non-small-cell lung carcinoma, while the pattern-matching strategy adopted by CMAP missed these “connections”. The implementation of OneComp is available at https://github.com/pathint/reoa. The results show that OneComp performed well on both the simulated and the real data. It is useful in drug repositioning studies because it helps to find hidden “connections” between drugs and diseases.
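
The background-REO steps described in the abstract can be illustrated with a minimal Python sketch. It is not the reoa implementation: the function names, the toy expression values, and the `min_fraction` stability threshold are all assumptions made for illustration (the abstract does not state the stability criterion), and the statistical test that turns reversed gene pairs into DEG calls is omitted.

```python
import numpy as np

def stable_reos(controls, min_fraction=1.0):
    """Return ordered pairs (hi, lo) such that gene `hi` is expressed above
    gene `lo` in at least `min_fraction` of the control samples.
    `controls` is a genes x samples array. The threshold is an assumption."""
    n_genes = controls.shape[0]
    pairs = set()
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            if np.mean(controls[i] > controls[j]) >= min_fraction:
                pairs.add((i, j))
            elif np.mean(controls[j] > controls[i]) >= min_fraction:
                pairs.add((j, i))
    return pairs

def customize_background(pairs, dataset_controls):
    """Drop background pairs whose ordering is reversed in any control
    sample of the dataset being analyzed (genes x samples array)."""
    return {(hi, lo) for hi, lo in pairs
            if not np.any(dataset_controls[hi] < dataset_controls[lo])}

def reversed_in_treated(pairs, treated):
    """Pairs whose ordering flips in a single treated sample (1-D array)."""
    return {(hi, lo) for hi, lo in pairs if treated[hi] < treated[lo]}

# Toy data (hypothetical): 4 genes, 4 accumulated control samples.
accumulated_controls = np.array([
    [9, 8, 9, 10],
    [5, 6, 5, 4],
    [1, 2, 7, 1],
    [3, 3, 2, 2],
])
background = stable_reos(accumulated_controls)

# One control and one treated sample from the small dataset under analysis.
dataset_controls = np.array([[8], [9], [1], [3]])
treated = np.array([2, 9, 1, 3])

customized = customize_background(background, dataset_controls)
flipped = reversed_in_treated(customized, treated)
```

In this toy example, gene pair (0, 1) is stable in the accumulated controls but reversed in the dataset's own control, so it is removed before the treated sample is examined. The pairs left in `flipped` are the kind of evidence a rank-based method would then test statistically.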
Keywords/Search Tags: The Connectivity Map, Differentially expressed genes, Drug repositioning, Phenformin, Metformin