A Multivariate Computational Framework to Identify Genomic Biomarkers in Cancer

Posted on: 2014-02-14
Degree: Ph.D
Type: Dissertation
University: Yale University
Candidate: Togun, Taio A
Full Text: PDF
GTID: 1454390005488632
Subject: Biology
Abstract/Summary:
Advancements in technology have opened up the opportunity to seek deeper knowledge of the underlying mechanisms that drive human diseases such as cancer. These advancements have made possible the simultaneous measurement of thousands of clinical and molecular variables, which can be used to guide the design of drugs targeted to a relevant molecule and thus lead to the elucidation and better management of cancer. However, given the size of the resulting catalog of variables, identifying the relevant key players (biomarkers) in such an enormous amount of data becomes a serious challenge. Hence, it is important to apply appropriate multivariate statistical and computational methods to identify the most informative biomarkers so that the biological advancements can be translated into improved patient care.

In this dissertation, we employ a powerful multivariate method, Random Forest, to derive a three-stage computational framework to identify genomic biomarkers in cancer. This framework derives a comprehensive list of relevant features associated with outcome, and employs the recently developed partitioning/deletion/substitution/addition (partDSA) algorithm to derive parsimonious, testable, and clinically interpretable models that offer insight into potential combinations of the identified variables of importance.

As part of our overall goal to translate advances in biology into improved patient care, we apply this framework to a glioblastoma study to identify SNPs that are associated with patients' survival. We derive clinically interpretable and testable partDSA models of these SNPs and discuss potential biological implications of some known genes linked to them. In addition, in order to better identify patients who will benefit from trastuzumab targeted therapy, we investigate the use of in situ quantitative measurement of Human Epidermal Growth Factor Receptor 2 (HER2) mRNA as a determinant of trastuzumab response in a HER2-overexpressing metastatic cohort. We find that HER2 mRNA, as well as other HER2 markers assessed by the Automated Quantitative Analysis (AQUA) method, is significantly associated with patients' time to progression. A final partDSA model of the markers selects SP3 and HER2 mRNA as the best combination of variables for risk stratification in our data. Finally, we apply our framework to identify genomic biomarkers of response to trastuzumab-based therapy in a separate metastatic cohort. We identify candidate DNA copy number regions as well as genes that are associated with patients' response, and derive interpretable and testable risk stratification models of these candidate biomarkers. These models provide useful insight into potentially relevant and biologically meaningful combinations of candidate biomarkers that can help explain patients' response to trastuzumab-based therapy.
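The variable-ranking idea behind the Random Forest stage can be illustrated with a small sketch. This is not the dissertation's implementation: it assumes a binary outcome rather than survival or time to progression, uses one-split trees (stumps) instead of full trees, and runs on synthetic genotype data. All function names and the toy data generator are hypothetical, and the partDSA modeling stage is not reproduced here.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump(rows, labels, candidates):
    """Return (feature, impurity decrease) of the best single split
    among the candidate feature columns, or (None, 0.0) if none helps."""
    n = len(rows)
    parent = gini(labels)
    best_feature, best_gain = None, 0.0
    for f in candidates:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue  # not a real split
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - child
            if gain > best_gain:
                best_feature, best_gain = f, gain
    return best_feature, best_gain


def stump_forest_importance(rows, labels, n_trees=300, seed=0):
    """Average impurity decrease per feature over bootstrapped stumps,
    each grown on a random subset of features (simplified Random Forest)."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    mtry = max(1, int(p ** 0.5))  # features tried per tree
    importance = [0.0] * p
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        candidates = rng.sample(range(p), mtry)
        f, gain = best_stump([rows[i] for i in idx],
                             [labels[i] for i in idx], candidates)
        if f is not None:
            importance[f] += gain
    return [v / n_trees for v in importance]


def make_toy_genotypes(n_samples=120, n_snps=6, seed=1):
    """Synthetic SNP genotypes coded 0/1/2; only column 0 drives the outcome."""
    rng = random.Random(seed)
    rows = [[rng.choice([0, 1, 2]) for _ in range(n_snps)]
            for _ in range(n_samples)]
    labels = [1 if row[0] >= 1 else 0 for row in rows]
    return rows, labels


rows, labels = make_toy_genotypes()
scores = stump_forest_importance(rows, labels)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print("SNP columns ranked by importance:", ranking)
```

In a setting like the one described above, a ranked list of this kind would serve only as the screening step; the top-ranked variables would then be passed to a separate, interpretable modeling step such as partDSA.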
Keywords/Search Tags:Biomarkers, Framework, Associated with patients', Cancer, Multivariate, Computational, Relevant, Response