A Docker Biological Cloud Computing Platform for Analyzing High-Throughput Sequencing Data of Bursaphelenchus xylophilus

Posted on: 2016-11-04  Degree: Doctor  Type: Dissertation
Country: China  Candidate: G Cheng  Full Text: PDF
GTID: 1223330470977782  Subject: Forest Protection
Abstract/Summary:
The widespread use of second-generation sequencing technology in life science research has driven rapid growth in sequence data and promises a new revolution for the life sciences and related industries. Even an ordinary laboratory can now generate petabyte-scale biological sequence data, so the effective storage, sharing, and efficient analysis of massive data pose a serious challenge for high-performance computing systems. Docker is an open-source container engine, originally built on LXC (Linux Containers), that came out of the PaaS provider dotCloud. Docker is developing rapidly in the field of cloud computing and is used by dotCloud, Google Compute Engine, and Baidu App Engine (BAE), among others. Docker is well suited to building cloud computing services: it encapsulates software flexibly so that it can be distributed more quickly, which serves the long-standing goal of cloud computing, namely that the Internet behaves as one giant computer. Biological big data also shares the "4V" characteristics of big data: Volume, Variety, Value, and Velocity.
Docker is convenient and efficient and meets the demands of biological big data, so we consider a Docker-based biological cloud computing platform the best approach to biological data analysis. This dissertation studies and discusses the following aspects and applies the platform to B. xylophilus. Customized analyses of its high-throughput sequencing data provide a method and a line of thinking for processing large-scale high-throughput sequencing data. The research content is as follows:
(1) Installing Docker on the Ubuntu operating system and basic Docker commands.
(2) Docker data management and commonly used basic commands. Data storage and management are important topics; we discuss how to mount host data in a container, how to create a data container, and so on.
(3) Based on the ubuntu-14.04-x86_64.tar.gz template, we created the ubuntu14.04 biodocker image. Because some analysis software depends on a variety of environments, we chose a relatively complete system as the base image.
(4) Based on the ubuntu14.04 biodocker image, we discuss three methods for installing genome, transcriptome, metagenomic, and other related software and scripts in the image, and we provide a Dockerfile as a reference for readers learning to build a Docker-based biological cloud computing platform image. The platform can be transferred quickly and efficiently to any operating system with a Linux kernel, whether on a PC, a cluster, or Google's or Amazon's cloud services. We deployed the ubuntu14.04 biodocker image on a PC and on the server of the Xiamen University data mining group.
(5) Using the platform, we constructed the genome-wide secretory protein gene family of B. xylophilus and obtained the protein sequences and their functional annotations. The results show that B. xylophilus has 923 secreted protein genes, of which only 93 received functional annotations; the remaining 90% of the secreted proteins are unique to B. xylophilus and deserve more attention and further research.
We also developed SSR markers and primers: we identified 12135 SSRs and developed 1155 primers, and we stored this information in a GFF3 file so that the position, type, length, primers, and other details of each SSR can be viewed intuitively and conveniently in a genome browser.
(6) Using the platform, we studied the differential expression and molecular evolution of secreted proteins based on the transcriptomes of B. xylophilus and B. mucronatus. The results indicate that 800 secreted proteins are expressed in the transcriptomes, of which 294 show significantly different expression between the two species; these proteins were annotated and analyzed. Among 498 homologous secreted protein genes, we found 16 genes whose Ka/Ks values were greater than 1 at a statistically significant level, which suggests that these genes are under strong natural selection and have undergone significant functional evolution in adapting to the environment.
(7) Using the platform, we studied the differential expression of orthologous genes in the transcriptomes of B. xylophilus and B. mucronatus. The results identify a large number of homologous genes from C. elegans and M. hapla; these homologous genes and their expression provide reliable information for functional annotation. Some homologous genes were also found in the distantly related species A. thaliana and P. trichocarpa, which provides a reference for studying the interactions between B. xylophilus and its host plants.
(8) Using the platform, we studied horizontal gene transfer between B. xylophilus and its associated bacteria by means of metagenomics. By comparing calculated GC percentages, we identified 15 genes horizontally transferred to B. xylophilus from symbiotic bacteria. Most of these genes have important physiological and biochemical functions, which provides strong evidence for research on the cooperative coevolution of the nematode and its associated bacteria.
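The GC-percent comparison mentioned in (8) can be illustrated with a minimal Python sketch. This is not the dissertation's pipeline: the function names, the toy sequences, the assumed background value of 40%, and the 10-percentage-point threshold are all hypothetical choices made for illustration. The idea is to compute the GC percentage of each gene and flag genes whose value deviates strongly from the genome-wide background as candidates for horizontal transfer.

```python
def gc_percent(seq):
    """Return the GC percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


def flag_gc_outliers(genes, background_gc, threshold=10.0):
    """Return names of genes whose GC% differs from the background by more than threshold.

    The threshold (in percentage points) is an illustrative assumption.
    """
    return [
        name
        for name, seq in genes.items()
        if abs(gc_percent(seq) - background_gc) > threshold
    ]


# Toy sequences (hypothetical, not real B. xylophilus genes).
genes = {
    "gene_a": "ATGGCTAAAGCTTTAGATGA",
    "gene_b": "ATGGCCGGCGCGCCGGGCTGA",
}
background = 40.0  # assumed genome-wide GC percentage, for illustration only

for name, seq in genes.items():
    print(name, round(gc_percent(seq), 1))
print("candidates:", flag_gc_outliers(genes, background))
```

A GC deviation alone is only a screening signal; candidates found this way would normally need confirmation by other evidence, such as sequence similarity to bacterial genes.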
Keywords/Search Tags: Bursaphelenchus xylophilus, Docker, cloud computing, genome, transcriptome, metagenomics