
DBSCAN-SWA: An Integrated Tool for Rapid Prophage Detection and Annotation

Posted on: 2022-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Si
GTID: 2480306572957129
Subject: Biology
Abstract/Summary:
A prophage is the intracellular form of a bacteriophage in the bacterial host genome. It integrates into bacterial DNA with high specificity and can contribute to horizontal gene transfer (HGT). With the exponentially increasing number of microbial sequences uncovered in genomic and metagenomic studies, there is a massive demand for a tool capable of fast and accurate identification of prophages. Here, we introduce DBSCAN-SWA, a command-line software tool developed to predict prophage regions in bacterial genomes.

Method: First, a server environment, tools, and algorithms were built for predicting and annotating prophage regions. Second, a database was established. This study collected data from the viruses section of UniProt TrEMBL and from the Millard Lab as a reference database, and downloaded the phage genome and protein data, which comprised 10,463 complete phage genome sequences and 684,292 non-redundant phage proteins. We also collected 184 manually curated prophages from 50 complete bacterial genomes. From the Human Microbiome Project (HMP), we collected 400 bacterial genomes from the human gastrointestinal tract for comparing algorithm performance. Finally, the algorithm workflow was developed, combining several methods, including density-based spatial clustering (DBSCAN) and a sliding-window algorithm, to achieve rapid detection and annotation of prophage regions.

Result: DBSCAN-SWA runs faster than previous tools. Importantly, it has strong detection power: in an analysis of the 184 manually curated prophages on raw DNA sequences, it achieved a recall of 85%, compared with Phage_Finder (63%), VirSorter (74%), and PHASTER (82%). Moreover, DBSCAN-SWA outperforms the existing standalone prophage prediction tools on high-throughput sequencing data, based on the analysis of 19,989 contigs from the 400 bacterial genomes collected from the HMP. DBSCAN-SWA also provides user-friendly result visualizations. A web server has been built, including a circular prophage viewer and interactive DataTables, providing rich and personalized graphic displays. DBSCAN-SWA is implemented in Python3 and is available under an open-source GPLv2 license from https://github.com/HIT-ImmunologyLab/DBSCAN-SWA.
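To make the density-based clustering step more concrete, the following is a minimal sketch of how DBSCAN could group phage-like genes into candidate prophage regions. It is not taken from the DBSCAN-SWA source code. The function names (`dbscan_1d`, `candidate_regions`), the parameter values (`eps`, `min_pts`), and the input gene indices are all assumptions chosen for illustration, and the sliding-window step is not covered.

```python
from typing import List, Optional, Tuple


def dbscan_1d(positions: List[int], eps: int, min_pts: int) -> List[int]:
    """Plain DBSCAN on one-dimensional positions.

    Returns one label per position: a cluster id (0, 1, ...) or -1 for noise.
    A point counts itself among its neighbours, as in the usual definition.
    """
    n = len(positions)
    labels: List[Optional[int]] = [None] * n  # None means "not visited yet"

    def neighbours(i: int) -> List[int]:
        return [j for j in range(n) if abs(positions[j] - positions[i]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_nbrs)
    return [label if label is not None else -1 for label in labels]


def candidate_regions(positions: List[int], eps: int, min_pts: int) -> List[Tuple[int, int]]:
    """Return (first_gene, last_gene) for each cluster, sorted by start."""
    labels = dbscan_1d(positions, eps, min_pts)
    spans = {}
    for pos, label in zip(positions, labels):
        if label == -1:
            continue
        lo, hi = spans.get(label, (pos, pos))
        spans[label] = (min(lo, pos), max(hi, pos))
    return sorted(spans.values())


# Hypothetical input: indices (gene order along the genome) of genes whose
# proteins matched a phage protein database. Values are invented.
phage_hit_genes = [3, 40, 41, 43, 44, 46, 90, 120, 121, 122, 124, 125]
regions = candidate_regions(phage_hit_genes, eps=3, min_pts=4)
print(regions)
```

With these illustrative settings, isolated hits such as genes 3 and 90 are treated as noise, while dense runs of phage-like genes are merged into candidate regions. In a real pipeline the distance threshold and minimum cluster size would need to be tuned against curated prophages.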
Keywords/Search Tags: Prophage, Phage, clustering, phage-host interaction