
Identifying gene-gene interactions and transcription regulators via dimension reduction methods

Posted on: 2011-05-08
Degree: Ph.D
Type: Dissertation
University: Michigan Technological University
Candidate: Cui, Xiaoqi
Full Text: PDF
GTID: 1443390002958421
Subject: Biology
Abstract/Summary:
The advent of whole-system approaches, such as DNA chips and high-throughput sequencing, has created opportunities for exploring new computational dimension reduction algorithms in modern genetic analysis. In this dissertation, I have applied dimension reduction methods to two problems in genetics: first, detecting gene-gene interactions (epistasis), and second, identifying candidate transcription regulatory genes (transcription factors).

For the first problem, I proposed two combinatorial statistical methods: MCSM and CPMDR. MCSM (Multivariate Combinatorial Searching Method) is designed to identify a set of loci associated with multiple traits. It can take multiple phenotypes into account at once, and it uses several feature selection techniques to search for a set of disease-susceptibility genes that may interact. By applying MCSM to the GAW16 (Genetic Analysis Workshop 16) rheumatoid arthritis data, we identified a significant gene-gene interaction between two genes, PTPN22 and TRAF1-C5.

CPMDR is a novel likelihood-based combinatorial method for locating interacting genes using only cases and their parents. It uses a conditional likelihood score for each nuclear family (parents and diseased children) to partition the multi-locus genotypes into high-risk and low-risk classes. Our simulation results showed that CPMDR detected the underlying interactions uniformly better than other popular methods across a variety of scenarios.

For the second problem, I designed an automated algorithm that combines adaptive sparse canonical correlation analysis (ASCCA) with k-means clustering to recognize transcription factors (TFs) involved in a biological process, using pooled gene expression data from publicly available resources. The algorithm is shown to be highly efficient at ranking known TFs and inferring novel ones, and multifunctional TFs can also be identified by intersecting the gene lists involved in different biological processes.
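
The idea of partitioning multi-locus genotypes into high-risk and low-risk classes can be illustrated with a minimal sketch. Note that this is not CPMDR itself: the abstract does not give the conditional likelihood score, so the sketch below substitutes the simpler case-control rule used in generic multifactor dimensionality reduction (a genotype cell is high risk when its cases outnumber its controls). The function names, the 0/1/2 genotype coding, the `threshold` parameter, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mdr_partition(genotypes, status, loci, threshold=1.0):
    """Label each multi-locus genotype cell 'high' or 'low' risk.

    Assumed rule (generic MDR, not CPMDR): a cell is high risk when
    its case count exceeds threshold * its control count.
    """
    cases, controls = Counter(), Counter()
    for row, is_case in zip(genotypes, status):
        cell = tuple(row[i] for i in loci)
        (cases if is_case else controls)[cell] += 1
    return {
        cell: "high" if cases[cell] > threshold * controls[cell] else "low"
        for cell in set(cases) | set(controls)
    }


def classification_accuracy(genotypes, status, loci, labels):
    """Fraction of individuals whose status matches their cell's risk label."""
    correct = 0
    for row, is_case in zip(genotypes, status):
        cell = tuple(row[i] for i in loci)
        predicted_case = labels.get(cell, "low") == "high"
        correct += predicted_case == bool(is_case)
    return correct / len(status)


def best_pair(genotypes, status, n_loci):
    """Exhaustively score every pair of loci and return (accuracy, pair)."""
    scored = []
    for loci in combinations(range(n_loci), 2):
        labels = mdr_partition(genotypes, status, loci)
        accuracy = classification_accuracy(genotypes, status, loci, labels)
        scored.append((accuracy, loci))
    return max(scored)


# Hypothetical toy data: status depends on loci 0 and 1 jointly
# (neither locus is informative alone); locus 2 is noise.
toy_genotypes = [
    (0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1),  # controls
    (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1),  # cases
]
toy_status = [0, 0, 0, 0, 1, 1, 1, 1]

print(best_pair(toy_genotypes, toy_status, n_loci=3))
```

On this toy data the search selects the pair of loci 0 and 1, since only their joint genotype separates cases from controls. A real analysis would add cross-validation and permutation testing, which are omitted here for brevity.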
Keywords/Search Tags:Transcription, Interactions, Gene-gene, Reduction, Methods