
Construction And Application Of Immune Repertoire High-throughput Sequencing Data Analysis Pipeline

Posted on: 2016-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: P Li
GTID: 2284330479982138
Subject: Bioinformatics
Abstract/Summary:
Advances in high-throughput sequencing have enabled a powerful technique for profiling the repertoire of the adaptive immune system. In a single high-throughput sequencing experiment, millions of T cell receptor (TCR) and B cell receptor (BCR) sequences can be generated in parallel from a single sample. This technology has given rise to the new field of immunosequencing, with promising applications in antibody discovery, vaccine design, the study of immune repertoire development, infectious diseases, autoimmune diseases, and cancer.

Although several bioinformatics tools are available for TCR/BCR VDJ gene sequence analysis, they cannot analyze repertoire-level properties such as clone clustering, diversity, and lineage structure. The development of immunosequencing calls for novel bioinformatics tools that can characterize the "repertoire-level" dynamics and diversity of immune systems. Clinical applications raise many questions: how to predict the total size of the immune repertoire from the available reads, how to monitor dynamic changes in BCR/TCR clones following vaccination, how to evaluate repertoire diversity after bone marrow transplantation, and how to identify public autoantibodies or autoreactive T cells in autoimmune diseases. These and many other questions call for software that can describe the dynamics and diversity of the immune repertoire. Here, we present a novel immune repertoire data analysis pipeline, a tool for processing raw data and characterizing the function of the immune repertoire in high-throughput sequencing experiments.
Its novel functions are as follows: (1) it can analyze the immune repertoire of a single subject at multiple time points to track dynamic changes in the immune response; (2) it can estimate the total size of the immune repertoire from the available sequence reads; (3) it can evaluate repertoire diversity using various metrics; (4) it can identify public TCR/BCR clones shared among different samples; and (5) it can plot lineage structure using network methods to reveal the detailed mutations within a lineage.

We also applied this pipeline to study VDJ gene recombination bias in the human IGH repertoire. VDJ gene recombination bias means that gene recombination is not entirely random. This phenomenon may shed light on the actual process of B cell development and may affect how a healthy immune repertoire is defined. We selected IGH repertoire data from 16 healthy individuals from two published papers. In addition to the bioinformatics methods described above, we used statistical methods to assess significance. The study is novel in that it examines datasets at the population and immune repertoire level, rather than a single individual or a single gene. We draw the following conclusions: (1) at the population and immune repertoire level, VDJ gene usage bias exists and is conserved across individuals; (2) at the population and immune repertoire level, productive and nonproductive sequences did not differ, so VDJ gene usage bias is not caused by cellular selection pressure; (3) comparing VDJ gene frequencies between individuals, we found and tested a high correlation among individuals, although there may be exceptions for the J gene.
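To make functions (2) to (4) concrete, the following Python sketch shows one possible way to compute them from clonotype counts. The abstract does not say which estimators or metrics the pipeline uses, so the choices here are assumptions: the bias-corrected Chao1 estimator stands in for total repertoire size, Shannon entropy stands in for the diversity metrics, and a clone is called public when it appears in at least two samples. The function names and the CDR3 strings are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def chao1(clone_counts):
    """Bias-corrected Chao1 lower-bound estimate of total clonotype richness.

    Assumption: Chao1 is used only as a common example of a richness
    estimator; the thesis abstract does not name its method.
    """
    observed = len(clone_counts)
    singletons = sum(1 for c in clone_counts.values() if c == 1)
    doubletons = sum(1 for c in clone_counts.values() if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon_entropy(clone_counts):
    """Shannon entropy (natural log) of the clonotype frequency distribution."""
    total = sum(clone_counts.values())
    return -sum((c / total) * math.log(c / total) for c in clone_counts.values())


def public_clones(samples, min_samples=2):
    """Return clonotypes observed in at least `min_samples` samples.

    `samples` maps a sample name to a Counter of clonotype -> read count.
    """
    presence = Counter()
    for counts in samples.values():
        presence.update(set(counts))  # count each clonotype once per sample
    return {clone for clone, n in presence.items() if n >= min_samples}


# Hypothetical toy data: CDR3-like strings with read counts.
sample_a = Counter({"CARDYW": 5, "CARGGW": 1, "CTRSSW": 1, "CAKDLW": 2})
sample_b = Counter({"CARDYW": 3, "CAKDLW": 1, "CARNEW": 4})

if __name__ == "__main__":
    print(chao1(sample_a))  # 4.5
    print(shannon_entropy(sample_a))
    print(sorted(public_clones({"a": sample_a, "b": sample_b})))
```

In this toy example, sample_a has four observed clonotypes, two singletons, and one doubleton, so the Chao1 estimate is 4 + 2·1 / (2·2) = 4.5. Real repertoire data would first need clonotype definitions (for example, V gene, J gene, and CDR3 sequence), which this sketch takes as given.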
Keywords/Search Tags:immune repertoire, recombination, VDJ, pipeline, bioinformatics