Research On Multi-resolution Shape Clustering Algorithm Based On Transcriptome Data And LncRNA-related Cancer Molecular Target Recognition

Posted on: 2018-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y He
Full Text: PDF
GTID: 2334330515473963
Subject: Engineering
Abstract/Summary:
Gastric cancer has one of the highest mortality rates among cancers worldwide, and patients in China account for about 40% of all gastric cancer patients globally. In recent years, some long non-coding RNAs (lncRNAs) have been found to be dysregulated in many cancers, which has raised researchers' interest in the relationship between lncRNAs and cancer. The detailed molecular mechanisms of gastric cancer remain largely unclear, especially with regard to lncRNAs. Interactions between cancer and lncRNAs can be established through biological experiments, but experimental identification of cancer-associated lncRNAs is often time-consuming and costly. This thesis proposes a computational approach that determines the relationship between lncRNAs and gastric cancer by reusing exon-array data for gastric cancer. A specific lncRNA (LINC00365) and its differentially expressed target genes were identified, and the target genes whose products are predicted to be secreted into blood, urine, or saliva were identified experimentally as candidate biomarkers for gastric cancer. The biological functions and molecular mechanisms of the lncRNA and coding-gene biomarkers related to gastric cancer are further inferred from multi-source biological knowledge.

The main work of this thesis is the analysis of lncRNAs. First, the probes in the tumor exon-array data from the Human Exon 1.0 ST array platform in the GEO database were re-annotated to obtain expression data for lncRNAs and coding genes. Genes related to gastric cancer were then selected using the p-value from the rank-sum test and the fold change of expression in tumor versus normal samples: a gene is considered significantly differentially expressed when its fold change is greater than 1.5 or less than 1/1.5 and its p-value is less than 0.01. Next, Pearson and Spearman correlation coefficients were calculated to construct an interaction network of the selected genes, and a logical function transformation was used to integrate the two coefficients into a single weight that reflects significant co-expression relationships. GO process and pathway enrichment analysis of the coding genes co-expressed with the differentially expressed lncRNAs was used to infer the biological functions in which the lncRNAs are involved. Finally, the coding genes associated with the lncRNAs were examined to determine whether their products can be secreted into body fluids, in order to find and verify a combination of biomarkers for gastric cancer.

The second part of this thesis studies time-course gene expression profiles, that is, expression data that change dynamically over time. Analysis of the downloaded data can reveal more significant statistical properties and biological significance in the data. In recent years, scientists' interest in mining time-course data has increased, and developing effective analytical methods is a major challenge. In time-course data, the expression value of a gene differs at each time point. Time-course expression experiments provide an opportunity to explore gene expression profiles over time and to understand the dynamic behavior of gene expression, which is essential for understanding biological development and disease. Based on multi-resolution fractal features and a hybrid clustering model, this thesis explores patterns of gene expression over time at different resolutions. The multi-resolution fractal features are obtained by wavelet decomposition, and the hybrid clustering model is a probabilistic framework that provides a more natural and robust method for cluster analysis; the gene groups it identifies have stronger biological significance. We therefore applied the multi-resolution hybrid clustering algorithm to tumor-associated time-course gene expression profile data and obtained global and local fractal features, which divided the data into biologically significant clusters.
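The differential-expression filter described in the abstract (fold change above 1.5 or below 1/1.5, with a rank-sum p-value below 0.01) can be sketched in Python as follows. This is a minimal illustration, not the thesis's own code. The function names, the linear-scale fold change computed as a ratio of group means, and the normal approximation without tie or continuity correction are all assumptions made for this example, and the sample values are made up.

```python
import math
from statistics import mean


def rank_sum_p_value(x, y):
    """Two-sided rank-sum p-value via the normal approximation.

    Simplification (assumption): no tie or continuity correction.
    """
    # Pool both groups, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(pooled)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Ranks i+1 .. j share their average rank.
        avg_rank = (i + 1 + j) / 2.0
        count_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += avg_rank * count_x
        i = j
    n1, n2 = len(x), len(y)
    mu = n1 * (n1 + n2 + 1) / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - mu) / sigma
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_differential(tumor, normal, fc_cutoff=1.5, p_cutoff=0.01):
    """Apply the fold-change and p-value thresholds stated in the abstract.

    Assumption: expression values are on a linear (not log) scale.
    """
    fold_change = mean(tumor) / mean(normal)
    p_value = rank_sum_p_value(tumor, normal)
    changed = fold_change > fc_cutoff or fold_change < 1.0 / fc_cutoff
    return changed and p_value < p_cutoff


# Made-up expression values for one gene in 8 tumor and 8 normal samples.
tumor = [20, 22, 24, 26, 28, 30, 32, 34]
normal = [10, 11, 12, 13, 14, 15, 16, 17]
print(is_differential(tumor, normal))
```

For real array data, a library routine such as `scipy.stats.mannwhitneyu` would normally replace the hand-written test, and a multiple-testing correction would usually be considered; the abstract does not say whether one was applied.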
Keywords/Search Tags: Bioinformatics, time-course gene expression profile, long noncoding RNA, differential expression, co-expression network, biomarkers, multi-resolution shape clustering algorithm