Objective(s): Multi-drug resistance is spreading at an alarming rate around the world, and phage therapy holds promise for addressing the problem of "superbugs." Phage therapy requires a complex screening process, and the clinical application of phages has been slow because guidelines for evaluating the safety of candidate phages are lacking. High-throughput sequencing technology has led to rapid growth in phage genome data, and the continuous updating of genomic analysis tools is expected to fundamentally change our understanding of phages and accelerate the application of phage therapy. By designing and evaluating a phage genome analysis pipeline, this study aims to rapidly extract the basic properties of phages and provide a straightforward screening strategy for clinical applications, with potentially significant public health impact.

Methods: Based on a literature review, the software used in phage genome research was preliminarily screened and analyzed. By assessing and comparing the various tools, high-quality analysis tools were chosen and integrated into an optimized phage genome analysis pipeline. For software testing, 75 phage genomes were downloaded from the National Center for Biotechnology Information (NCBI) Viral Genomes database. The genome sequences of six Heqing plague phage strains were obtained by isolation, culture, purification, nucleic acid extraction, and sequencing, and these genomes were evaluated with the phage genome analysis pipeline.

Results: 1. According to a statistical analysis of 90 papers on phage genomes published from 2017 to 2022, the literature mainly describes the methods and tools used for phage genome assembly, annotation, and tRNA prediction. Annotation is the most frequently described step, with a full explanation in 88.67% of papers. The use of genome assembly software is relatively dispersed, with the top three packages
being SPAdes, CLC Genomics Workbench, and SOAPdenovo. The most frequently used genome annotation approaches in the literature were manual annotation, RAST, GeneMark, and Glimmer 3.02. For tRNA prediction and annotation, all of the papers used tRNAscan-SE.

2. Among phage assembly tools, SPAdes and SOAPdenovo2 were compared in detail. Compared with SOAPdenovo2, SPAdes produced better assemblies and is more widely cited; all six scaffolds were successfully constructed, which was adequate for phage genome assembly. PhageTerm was used to predict and identify terminal enzymes. Four of the six phages were annotated as having 5' terminal enzymes, but PhageTerm could not recognize terminal enzymes in the novel phages HQ12 and HQ17. Although Glimmer does not rank highest in terms of journal impact factor and citation count, it clearly outperformed the other four prediction tools in the number and length of predicted phage open reading frames, and is better suited to open reading frame prediction in phage genomes. ARAGORN and tRNAscan-SE 2.0 gave nearly identical tRNA annotation results, and the newly released tRNAscan-SE 2.0 was chosen as the tRNA prediction tool for the pipeline. For the personalized phage analysis, PhaGCN, CHERRY, and PhaTYP were used to predict phage taxonomy, host, and lifecycle, respectively. The Comprehensive Antibiotic Resistance Database (CARD) and the Virulence Factor Database (VFDB) should be used to screen for antibiotic resistance and virulence genes.

3. Relatively complete genome information for the six Heqing plague phages was obtained with the phage genome analysis pipeline, providing a theoretical basis and technical guidance for subsequent work on preventing and controlling plague infection with plague phages.

Conclusion(s): The phage genome analysis pipeline was developed according to the criteria for high-throughput sequencing of viral genomes and the principles of
ideal phages for therapeutic use. The feasibility and practicality of the pipeline were demonstrated on six new Heqing plague phage genomes and 75 published phage genomes. After preliminary genome screening, these six phages were found not to be suitable for the prevention and control of plague in natural foci. The pipeline is also being set up on the network analysis platform of the National Microbiology Science Data Center for researchers to use. It can serve as a minimum guideline for the genomic safety analysis of phages in clinical applications.
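
The final screening step of such a pipeline (lifecycle prediction plus resistance and virulence gene screening) could be expressed as a simple rule check. The following is a minimal sketch only, not the authors' implementation: the `PhageReport` record, its field names, and the specific exclusion rules (a candidate must be predicted virulent and carry no CARD or VFDB hits) are assumptions introduced for illustration, and the example records are hypothetical rather than results from the study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhageReport:
    """Hypothetical summary of per-phage pipeline outputs (assumed structure)."""
    name: str
    lifecycle: str  # e.g. "virulent" or "temperate", as a lifecycle predictor might report
    resistance_genes: List[str] = field(default_factory=list)  # hits against CARD
    virulence_genes: List[str] = field(default_factory=list)   # hits against VFDB


def screening_flags(report: PhageReport) -> List[str]:
    """Return the reasons a candidate fails the assumed safety screen."""
    flags = []
    if report.lifecycle.strip().lower() != "virulent":
        flags.append("lifecycle is not predicted to be virulent")
    if report.resistance_genes:
        flags.append("antibiotic resistance genes detected")
    if report.virulence_genes:
        flags.append("virulence genes detected")
    return flags


def passes_screen(report: PhageReport) -> bool:
    """A candidate passes only if no exclusion rule is triggered."""
    return not screening_flags(report)


if __name__ == "__main__":
    # Hypothetical example records, not data from the study.
    candidates = [
        PhageReport("example_phage_A", "virulent"),
        PhageReport("example_phage_B", "temperate", virulence_genes=["example_gene"]),
    ]
    for c in candidates:
        status = "pass" if passes_screen(c) else "fail"
        print(c.name, status, screening_flags(c))
```

Returning the list of triggered rules, rather than a bare pass/fail value, keeps the reason for each exclusion visible, which matters if the output is meant to support a safety guideline.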