
Prediction Of Phage-bacteria Interaction Signals

Posted on: 2021-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: R Gan
Full Text: PDF
GTID: 2370330611999163
Subject: Bio-engineering
Abstract/Summary:
Phages are viruses that specifically infect bacteria. They coexist and co-evolve with bacteria in the natural environment, affecting the entire ecosystem. Phage-host interactions provide an effective means of studying the adaptive evolution of bacteria and play an increasingly important role in human health and disease. Because phages are highly specific, proliferate exponentially, cause few adverse reactions, and are extremely diverse, using them to regulate and reshape the complex intestinal flora will help to develop new therapeutic agents against drug-resistant bacteria and targeted phage therapies. With the spread of high-throughput sequencing, massive numbers of viral sequences lacking host information urgently need host prediction in order to understand virus-host dynamics in microbial communities. This study aims to predict phage-host interactions comprehensively by integrating multiple phage-host interaction signals and using multiple machine learning algorithms.

Methods: First, server environments were built for predicting five phage-host interaction signals: CRISPR, prophage, genetic homology, protein-protein interaction, and sequence composition. Second, 13,055 bacterial genomes and 10,463 bacteriophage genomes were downloaded from NCBI and the literature to build a comprehensive database of phage-host interaction signals. Third, procedures were developed for three tasks: evaluating the interaction probability for a given pair of phage and bacterial genomes, predicting the infecting phages for a query prokaryotic genome, and predicting the bacterial host for a query phage genome. These procedures use 7 different machine learning algorithms (including random forest, decision tree, Bayesian, logistic regression, and support vector machine) trained on 18 signal features of known phage-host pairs, combined with two criteria. The procedure produces a prediction for each phage-host interaction signal and integrates all the predicted results.

Results: The phage-bacteria interaction prediction algorithm supports two-way prediction from three angles: predicting the host of a phage, predicting the phages that interact with a bacterium, and evaluating a given phage-bacterium pair. In addition to using published tools, a prophage predictor was developed: a set of algorithms that combines a density-based spatial clustering algorithm with a sliding-window method to predict prophage regions. For the 5 interaction signals, 18 related signal features that can represent the interaction between phage and bacteria were defined. For the machine learning models, 817 phage-bacteria pairs with known interactions were used for parameter training with 10-fold cross-validation. After the optimal parameters were obtained, 936 phage-bacteria pairs with known interactions, distinct from the training set, were used as the test set; the prediction accuracy reached 0.875 and the area under the ROC curve (AUC) reached 0.93. In addition, when the standalone version was used to predict the hosts of 125,842 metagenomic viral contigs (mVCs), hosts were predicted for 54.54% of them, whereas Paez-Espino et al. made predictions for only 7.7% of the mVCs. The web server has been successfully built and provides rich, customizable graphics. The website has been visited and used by users from many countries.
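As an illustration of the sequence composition signal named in the abstract, the following minimal Python sketch compares the oligonucleotide (k-mer) frequency profiles of two genomes. The abstract does not state the value of k, the distance measure, or how the score enters the 18 features, so the choice of k = 4, the Manhattan distance, and the function names `kmer_frequencies` and `composition_distance` are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
from typing import Dict


def kmer_frequencies(sequence: str, k: int = 4) -> Dict[str, float]:
    """Return the relative frequency of every DNA k-mer in `sequence`.

    Assumption: k = 4 is an illustrative default, not a value from the thesis.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Only unambiguous k-mers are counted; windows containing N etc. are ignored.
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: counts[kmer] / total for kmer in kmers}


def composition_distance(seq_a: str, seq_b: str, k: int = 4) -> float:
    """Half the Manhattan distance between two k-mer profiles (range 0 to 1).

    Assumption: the distance measure is hypothetical; a smaller value means
    the two sequences have more similar oligonucleotide usage.
    """
    freq_a = kmer_frequencies(seq_a, k)
    freq_b = kmer_frequencies(seq_b, k)
    return 0.5 * sum(abs(freq_a[kmer] - freq_b[kmer]) for kmer in freq_a)


if __name__ == "__main__":
    # Toy sequences standing in for a phage genome and a candidate host genome.
    phage = "ATGCGTACGTTAGCATGCGTACGTTAGC"
    host = "ATGCGTACGTTAGCATGCCTACGTTAGG"
    print(composition_distance(phage, host))
```

A score of this kind would be one candidate feature for a phage-bacterium pair; in the workflow described above, such features are combined with the CRISPR, prophage, homology, and protein-protein interaction signals before classification.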
Keywords/Search Tags: Bacteriophage, CRISPR, Prophage, Oligonucleotide frequency, Protein-protein interaction