Explore Relationships Of Genomic And Clinical Data Based On Integrated TCGA Datasets

Posted on: 2017-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Huang
GTID: 2180330485957126
Subject: Biomedical engineering
Abstract/Summary:
Several large-scale human cancer genomics projects, such as TCGA and ICGC, have been launched and offer researchers vast amounts of genomic and clinical data. These data help researchers mine meaningful genomic alterations that affect the development and metastasis of tumors. However, basic medical researchers and translational medicine researchers who lack data analysis knowledge and bioinformatics training are often unable to make use of these text files. As medical informatics researchers, we need to apply informatics and statistical techniques to cancer genomics data analysis, acting as a bridge between big data and basic medical researchers and helping them explore these data. We therefore propose an online cancer genomic analysis platform (TCGA4U: http://www.tcga4u.org:8888) that offers analysis services for TCGA text data to basic medical researchers and translational medicine researchers.

This thesis addresses the following problems:
1. Building a cancer genomic knowledge base that supports upper-layer applications by integrating genomic text files and clinical data.
2. Providing genomic data analysis services that let researchers analyze multiple genomic data types together with clinical data, and explore these data and the relationships among them in depth.
3. Guiding researchers in using the platform to analyze the biological processes and signaling pathways related to genomic data.

This thesis proposes a system architecture for a cancer genomic data analysis platform. First, we integrate somatic mutation, gene expression, DNA methylation, and copy number variation data, and extend them with Gene Ontology data, the human genome reference sequence (GRCh37), and the EBI Molecular Interaction Database (EBI-IntAct). The result is a basic cancer genomic knowledge base that provides data support and services for upper-layer applications. To process large volumes of data quickly and to draw on a wide range of statistical algorithms, the platform uses R as its statistics engine for data analysis services. In addition, we developed several algorithms and text-processing modules that expose interfaces to upper-layer applications. For the presentation layer, we developed basic data views and visualizations that help users inspect the results of statistical analyses.

Finally, this thesis presents a research case that shows how the platform can be used to carry out a study. Using the platform's knowledge base, we examine gene expression patterns related to breast cancer survival and obtain two results:
1. Mitochondrial ribosomes play a more crucial role in cancer development, and the total expression of mitochondrial ribosomes and cytosolic ribosomes is in a balanced state.
2. HSPA2 shows a markedly different expression pattern in breast cancer compared with previous findings in other cancer types: patients with low HSPA2 expression have poor survival.
To verify these conclusions, we also checked four breast cancer datasets in Oncomine and the Netherlands Cancer Institute (NKI) breast cancer dataset.

In summary, we propose and develop an online cancer genomic data analysis platform for exploring the relationships between genomic and clinical data in depth. With the platform, translational medicine researchers and basic medical researchers can explore various kinds of TCGA genomic and clinical data and investigate the role these data play in the development and metastasis of tumors. We also publish the latest research results, with the aim of contributing to the mining of driver genomic alterations and to personalized therapy.
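The kind of survival comparison used in the case study (grouping patients by the expression level of a gene and comparing their survival) can be sketched as follows. This is only a minimal illustration in Python, not the platform's R implementation: the function names, the median cutoff, and the toy values are all assumptions introduced here, and the Kaplan-Meier estimator is a standard technique that the abstract does not explicitly name.

```python
from statistics import median


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (time, survival) at each event time.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= removed
    return curve


def survival_by_expression(expression, times, events):
    """Split patients at the median expression (assumed cutoff) and
    estimate one survival curve per group."""
    cutoff = median(expression)
    groups = {"low": ([], []), "high": ([], [])}
    for x, t, e in zip(expression, times, events):
        key = "low" if x <= cutoff else "high"
        groups[key][0].append(t)
        groups[key][1].append(e)
    return {k: kaplan_meier(ts, es) for k, (ts, es) in groups.items()}


if __name__ == "__main__":
    # Toy values for illustration only; not TCGA data.
    expression = [1.0, 2.0, 3.0, 4.0]
    times = [5, 6, 7, 8]
    events = [1, 1, 0, 1]
    for group, curve in survival_by_expression(expression, times, events).items():
        print(group, curve)
```

In practice the two curves would be compared with a formal test such as the log-rank test, and the cutoff would be chosen to suit the study rather than fixed at the median.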
Keywords/Search Tags: genomic data analysis, survival analysis, data mining, prognosis assessment