Study of Methods of Microbial Association Network Inference Based on Hierarchical Bayesian Model

Posted on: 2020-01-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Q Yang
Full Text: PDF
GTID: 1360330626964472
Subject: Computer Science and Technology
Abstract/Summary:
Understanding interactions among microbes, and between microbes and their environment, is a key research topic in microbial ecology. Because of the limitations of traditional culture-based laboratory studies, most interactions within real microbial communities remain unknown to biologists. Advances in metagenomic sequencing technology let researchers learn the composition and abundance of microbes in real environments by analyzing microbial sequencing reads; association inference methods can then be used to untangle the complex microbial interactions in the community. However, the complexity of the metagenomic sequencing process and of microbial interactions poses a series of challenges for the design of association inference algorithms. Most current methods address only one or two problems, such as compositional bias, indirect associations, or non-linear relationships, and so cannot handle all aspects of microbial association inference comprehensively. To address these problems, this thesis builds two microbial association inference approaches based on a hierarchical Bayesian model, which take into account the characteristics of metagenomic sequencing data and the basic rules of microbial interactions. The major contributions of this study are as follows:

1. Considering four problems of static association inference (compositional bias, over-dispersion, indirect associations, and the effects of environmental factors), we propose the metagenomic Lognormal-Dirichlet-Multinomial model (mLDM). mLDM estimates both conditionally dependent associations among microbes and direct associations between microbes and environmental factors, while accounting for compositional bias and over-dispersion simultaneously. mLDM removes indirect associations among microbes to improve the interpretability of the inferred associations. The efficiency of mLDM is validated on synthetic data and multiple real datasets.

2. To model non-linear associations in complex microbial interactions, we propose the kLDM model, which extends the hierarchical Bayesian model of the static association network inference algorithm by assigning a probability distribution to environmental factors, and which infers multiple association networks according to the variation of environmental factors. kLDM can estimate the environmental conditions within sequencing datasets and, under each environmental condition, estimate both associations among microbes and associations between microbes and environmental factors. Environmental conditions are determined by comparing the similarity of environmental factor values, microbial abundance, and associations, which helps biologists analyze heterogeneity in microbial composition and associations. Its performance on real sequencing datasets shows that kLDM can analyze complex associations in large-scale data.

3. Combining the two aforementioned association inference approaches, the hierarchical Bayesian framework proposed in this study can be applied not only to static association inference with a limited number of environmental factors and samples, but also to the inference of multiple association networks for complex microbial communities with many samples and abundant environmental factors. The framework is highly extensible.
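
The abstract names the layers of the first model (lognormal, Dirichlet, multinomial) but does not give its equations. As a rough illustration only, the sketch below simulates read counts from one plausible reading of such a hierarchy: a Gaussian latent vector whose mean depends linearly on environmental factors, exponentiated into Dirichlet parameters, which in turn yield the multinomial proportions for the observed counts. The function name `sample_ldm`, the parameter names (`B`, `b0`, `precision`, `depth`), and all numeric values are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np


def sample_ldm(env, B, b0, precision, depth, rng):
    """Simulate counts from an assumed lognormal-Dirichlet-multinomial hierarchy.

    env:       (n_samples, n_env) environmental factor values
    B:         (n_env, n_taxa) assumed direct microbe-environment effects
    b0:        (n_taxa,) assumed baseline log-abundances
    precision: (n_taxa, n_taxa) assumed precision matrix; its off-diagonal
               entries play the role of conditional microbe-microbe associations
    depth:     number of reads per sample
    """
    cov = np.linalg.inv(precision)
    counts = np.empty((env.shape[0], b0.size), dtype=np.int64)
    for i in range(env.shape[0]):
        mean = B.T @ env[i] + b0                 # environment shifts the latent mean
        z = rng.multivariate_normal(mean, cov)   # Gaussian latent layer
        alpha = np.exp(z)                        # lognormal Dirichlet parameters
        h = rng.dirichlet(alpha)                 # per-sample relative abundances
        counts[i] = rng.multinomial(depth, h)    # observed reads (compositional)
    return counts


rng = np.random.default_rng(0)
n_samples, n_env, n_taxa, depth = 50, 2, 4, 1000
env = rng.normal(size=(n_samples, n_env))
B = rng.normal(scale=0.3, size=(n_env, n_taxa))
b0 = np.ones(n_taxa)
# Tridiagonal, diagonally dominant matrix: a valid (positive definite) precision.
precision = 2.0 * np.eye(n_taxa) - 0.5 * (np.eye(n_taxa, k=1) + np.eye(n_taxa, k=-1))
counts = sample_ldm(env, B, b0, precision, depth, rng)
```

Every row of `counts` sums to the fixed sequencing depth, which is the compositional constraint the abstract refers to; the Dirichlet layer adds sample-to-sample variation beyond a plain multinomial, which is one common way to represent over-dispersion. Inference in the thesis runs in the opposite direction (from counts to associations) and is not shown here.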
Keywords/Search Tags: metagenomics, association inference, hierarchical Bayesian model, environmental condition, machine learning