
Parameter estimation and network identification in metabolic pathway systems

Posted on: 2009-08-13
Degree: Ph.D.
Type: Dissertation
University: Georgia Institute of Technology
Candidate: Chou, I-Chun
Full Text: PDF
GTID: 1440390002490540
Subject: Biology
Abstract/Summary:
Cells are able to function and survive due to a delicate orchestration of the expression of genes and their downstream products at the genetic, transcriptomic, proteomic, and metabolic levels. Since metabolites, the end products of gene expression, are ultimately the causative agents for physiological responses and responsible for much of the functionality of the organism, a comprehensive understanding of cell functioning mandates deep insights into how metabolism works. However, the regulation and dynamics of metabolic networks are often too complex to allow intuitive predictions, which makes mathematical modeling necessary as a means for assessing and understanding metabolic systems.

The construction of mathematical models for metabolic pathways is challenging, and a particularly complicated task is the estimation of model parameters and the identification of network structure. Recent advances in high-throughput techniques have made it possible to produce time series data that characterize dynamic metabolic responses and enable us to tackle estimation and identification tasks using "top-down" or "inverse" approaches. However, extracting information regarding the structure and regulation of the system described by these data is difficult. The challenges can be broadly grouped into four problem areas: data-related, model-related, computational, and mathematical issues.

To develop improved methods for inverse modeling that are effective, fast, and scalable, this work proposes two novel algorithms, Alternating Regression (AR) and Eigenvector Optimization (EO), both applied to S-systems in Biochemical Systems Theory (BST). The AR method employs a decoupling technique for systems of differential equations and dissects the complex nonlinear parameter estimation task into iterative steps of linear regression by utilizing the fact that power-law functions are linear in logarithmic space. AR is very fast in comparison to conventional methods and works well in many applications. In cases where convergence is an issue, its speed makes it feasible to dedicate some computational effort to identifying suitable start values and search settings. AR is beneficial for the identification of system structure in S-systems as well.

A modification of the AR algorithm is 3-way Alternating Regression (3-AR), which was applied here to parameter estimation in S-distributions, a family of statistical distributions motivated by S-systems. 3-AR preserves the properties of AR but iterates the algorithm between three phases of linear regression. The 3-AR algorithm is very fast and performs well for artificial, error-free and noisy datasets, as well as for random samples generated from traditional statistical distributions and for observed raw data.

The EO method is an extension of AR that is based on a matrix formed from multiple regression equations of the linearized decoupled S-systems. In contrast to AR, EO operates initially only on one term, whose parameter values are optimized completely before the complementary term is estimated. It was demonstrated that the EO algorithm converges quickly and can be expected to converge in most cases, without necessarily requiring knowledge of the network structure. Furthermore, EO is easily extended to the optimization of network topologies with stoichiometric precursor-product constraints among equations.

To integrate all existing techniques and make inverse modeling more effective, this work proposes an operational "work-flow" that guides the user through the estimation process, identifies possibly problematic steps, and suggests corresponding solutions based on the specific characteristics of the various available algorithms. A significant consequence and advantage of the combined approach is that the result often consists of multiple parameter sets that are all consistent with the data and that can lead to hypotheses for further theoretical and experimental investigation. Finally, the work described here discusses a recent Dynamic Flux Estimation (DFE) approach, which resolves open issues of model validity and quality beyond residual errors. The necessity of fast solutions to biological inverse problems is discussed in the context of concept map modeling, which allows the conversion of hypothetical network diagrams into mathematical models.
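
The alternating-regression idea summarized in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation and rests on several assumptions. It treats one decoupled S-system equation of the form dX_i/dt = alpha * prod_j X_j^g_j - beta * prod_j X_j^h_j. It assumes the slopes dX_i/dt have already been estimated at each time point, which is what decoupling usually provides. It uses ordinary least squares for each linear step. The function names (`power_law`, `log_linear_fit`, `alternating_regression`, `make_example`) and the synthetic parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def power_law(X, log_coef, exponents):
    """Evaluate coef * prod_j X_j**exponent_j for every row of X."""
    return np.exp(log_coef + np.log(X) @ exponents)


def log_linear_fit(X, target):
    """Least-squares fit of log(target) = log_coef + sum_j e_j * log(X_j)."""
    if np.any(target <= 0):
        # The logarithm is undefined; the current start values are unsuitable.
        raise ValueError("non-positive term, try other start values")
    design = np.column_stack([np.ones(len(X)), np.log(X)])
    params, *_ = np.linalg.lstsq(design, np.log(target), rcond=None)
    return params[0], params[1:]


def alternating_regression(X, slopes, log_beta, h, n_iter=50):
    """Alternate between two linear regressions in log space.

    X        : (n_points, n_vars) positive metabolite concentrations
    slopes   : (n_points,) estimated dX_i/dt for the equation being fitted
    log_beta : start value for log(beta) of the degradation term
    h        : (n_vars,) start values for the degradation exponents
    """
    if n_iter < 1:
        raise ValueError("n_iter must be at least 1")
    for _ in range(n_iter):
        # Phase 1: hold the degradation term fixed; production = slope + degradation.
        log_alpha, g = log_linear_fit(X, slopes + power_law(X, log_beta, h))
        # Phase 2: hold the production term fixed; degradation = production - slope.
        log_beta, h = log_linear_fit(X, power_law(X, log_alpha, g) - slopes)
    return np.exp(log_alpha), g, np.exp(log_beta), h


def make_example(seed=0):
    """Noise-free synthetic data for one equation (illustrative values only)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.5, 2.0, size=(30, 2))
    true = {"alpha": 2.0, "g": np.array([-0.5, 0.0]),
            "beta": 1.0, "h": np.array([0.0, 0.5])}
    slopes = (power_law(X, np.log(true["alpha"]), true["g"])
              - power_law(X, np.log(true["beta"]), true["h"]))
    return X, slopes, true
```

Calling `alternating_regression(X, slopes, log_beta0, h0)` with data from `make_example()` runs the two phases in turn. In this sketch, a start value that makes either term non-positive raises a `ValueError` rather than continuing, which mirrors the abstract's remark that some effort may have to go into finding suitable start values. Convergence from an arbitrary start is not guaranteed here.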
Keywords/Search Tags: Network, Estimation, Metabolic, Identification, Systems, Mathematical, Modeling