
Sharpening the Edge of Tools for Microbial Diversity Analysis

Posted on: 2013-08-26
Degree: Ph.D
Type: Dissertation
University: University of Southern California
Candidate: Hao, Xiaolin
Full Text: PDF
GTID: 1453390008965540
Subject: Biology
Abstract/Summary:
Metagenomics studies have prospered from the rapid development of next-generation sequencing. However, microbial diversity analysis, an essential component of metagenomics, still faces three major challenges: handling errors in data, analyzing large datasets efficiently, and avoiding primer bias. Because 16S rRNA gene sequences are frequently used to profile microbial diversity, we focus on this data type and provide solutions to all three challenges. Our proposed unsupervised Bayesian clustering method, Clustering 16S rRNA for OTU Prediction (CROP), finds clusters based on the natural organization of the data without setting a hard cutoff threshold (3%/5%) as required by hierarchical clustering methods. By applying our method to several datasets, we demonstrate that CROP is robust against sequencing errors and that it efficiently produces more accurate results than conventional hierarchical clustering methods. We also built a generic model for comparing 16S rRNA gene fragment data extracted from metagenomic shotgun sequencing data with targeted 16S rRNA sequencing data. This model, when combined with future benchmarking studies, could help validate the ability of 16S rRNA gene fragment data to avoid primer bias and provide unbiased microbial diversity estimates. Our proposed analysis pipeline could also be implemented for future 16S rRNA gene fragment-based studies.
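To make the hard-cutoff issue concrete, the sketch below shows the conventional baseline the abstract contrasts CROP against. It is not CROP and not the dissertation's code. It is a minimal single-linkage grouping under stated assumptions: reads are already aligned to equal length, distance is the plain mismatch fraction, and all names (`mismatch_fraction`, `single_linkage_otus`, `mutate`) and the toy reads are hypothetical.

```python
from itertools import combinations


def mismatch_fraction(a, b):
    """Fraction of differing positions between two pre-aligned, equal-length reads."""
    if len(a) != len(b):
        raise ValueError("reads must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def single_linkage_otus(reads, cutoff):
    """Group read indices into OTUs: two reads share an OTU when a chain of
    pairwise distances <= cutoff connects them (single linkage, union-find)."""
    parent = list(range(len(reads)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(reads)), 2):
        if mismatch_fraction(reads[i], reads[j]) <= cutoff:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(reads)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def mutate(seq, positions):
    """Toy helper: substitute the base at each given position."""
    swap = {"A": "C", "C": "A", "G": "T", "T": "G"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


# Toy data (assumed, 100 bp each): a reference read, a read with 1 mismatch,
# a read with 4 mismatches, and an unrelated read.
base = "ACGT" * 25
reads = [base, mutate(base, [10]), mutate(base, [20, 40, 60, 80]), "TGCA" * 25]

print(single_linkage_otus(reads, 0.03))  # [[0, 1], [2], [3]]
print(single_linkage_otus(reads, 0.05))  # [[0, 1, 2], [3]]
```

The same four reads yield three OTUs at a 3% cutoff and two at 5%, which illustrates why the result of a hierarchical method depends on the threshold chosen in advance.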
Keywords/Search Tags: Microbial diversity, 16S rRNA gene, Studies, Sequencing