| Background & Objective: Human astrovirus (HAstV) was first discovered in 1975 by Appleton and Higgins in stool samples of children with diarrhea. It was named for the star-like appearance of the virus particle surface under the electron microscope. It is one of the main pathogens causing acute gastroenteritis. HAstV-MLB2, first reported in 2009, has the basic structure of traditional HAstV. Hematopoietic stem cell transplantation (HSCT) patients have poor immune function and a long disease course. They are susceptible to various pathogens, which can persist in the host for a long time. Some of these pathogens, especially RNA viruses, have high mutation rates due to their rapid replication and the lack of proofreading activity of RNA polymerase. As the patient’s immunity gradually recovers, the host exerts selective pressure on the virus, leading to the emergence of more virulent strains. Thus, HSCT patients are regarded as a source pool for novel viruses. In this study, we used high-throughput sequencing to characterize the viral metagenome in the feces of immunocompromised patients after HSCT, determined the genome structure of HAstV-MLB2-YJMGK, evaluated its quasispecies heterogeneity, and analyzed the genetics and evolution of globally circulating HAstV isolates. Methods: Thirty fecal samples collected at different times from 10 allogeneic hematopoietic stem cell transplantation (allo-HSCT) patients were sequenced by high-throughput sequencing (MiSeq platform, Illumina). After the raw sequences were analyzed on a bioinformatics analysis platform, software such as Geneious was used to extract and analyze the HAstV-MLB2 gene sequences of interest. The presence of HAstV-MLB2 was confirmed by polymerase chain reaction (PCR) using primers designed from the contig sequences obtained by MiSeq high-throughput sequencing. HAstV screening was performed on all 50 fecal samples collected from allo-HSCT patients. Specific primers were designed for
the ORF1b region, and 702 bp fragments were amplified. Positive and negative controls were included in each PCR. The astrovirus amplified in this study was named HAstV-MLB2-YJMGK. To obtain the whole genome of HAstV-MLB2-YJMGK, primers were first designed from the sequences obtained by high-throughput sequencing; further specific primers were then designed from the newly obtained sequences to amplify the remaining regions. The RACE and genome walking methods were used to amplify the 5' and 3' ends of the genome. Finally, primers were designed for long-fragment amplification to confirm the complete genome sequence. The sequence and gene structure of the complete genome were analyzed. Phylogenetic trees were constructed using the nucleotide (nt) sequences of ORF2. The complete genomes of known HAstV-MLB2 strains were downloaded from GenBank. Trees were generated with MEGA (v.6.0) software using the neighbor-joining method with 1,000 bootstrap replicates. To detect recombination, aligned sequences were analyzed using the bootscanning method in SimPlot software. To precisely estimate the substitution rate, a Bayesian Markov chain Monte Carlo (MCMC) approach was implemented using BEAST software (v.1.8.2). jModelTest software (v.2.1.7) was used to identify the optimal evolutionary model. The Akaike information criterion and hierarchical likelihood ratio test suggested that the general time reversible (GTR) + Γ (gamma-distributed rate variation) model best fitted the sequences. The results were computed and analyzed using Tracer (v.1.6). Statistical uncertainty in the data was reflected by 95% highest posterior density (HPD) values. Demographic history under a constant-growth population model was plotted using the GTR+Γ+UCLD (uncorrelated log-normal distribution relaxed clock) models in BEAST (v.1.8.2). The Bayesian skyline plot was analyzed using Tracer. Diversity
was quantified as the mean genetic distance calculated for all pairs of nt sequences using MEGA software (v.6). The rates of synonymous substitutions per synonymous site (dS) and nonsynonymous substitutions per nonsynonymous site (dN) were calculated in MEGA using the method of Nei and Gojobori with the Jukes-Cantor correction for multiple substitutions. Results: Through high-throughput sequencing, a total of 724757000 clean reads were obtained from the 30 fecal samples, among which 447,230 were virus sequences (15.4%), and 852 reads matched HAstV-MLB2. Based on the initial sequence, the nearly full-length genome sequence of HAstV-MLB2-YJMGK, 6131 bp in total, was obtained. Homology analysis showed that the whole sequence had 98% nucleotide identity with the known reference strain MLB2-GUP187 (AB829252.1) at 100% coverage, and that variation was highest in ORF1b. HAstV-MLB2-YJMGK has three open reading frames (ORF1a, ORF1b and ORF2). ORF1a and ORF1b encode non-structural proteins; immunoreactive epitope analysis of the ORF1a region showed no change in immunoreactivity. ORF2 encodes the capsid protein; antigen prediction for the capsid protein indicated that the antigenicity of HAstV-MLB2-YJMGK was unchanged. HAstV-MLB2-YJMGK was detected in 5 stool samples from two patients, and the interval between the first and last samples of the two patients was about 25 days. The most recent common ancestor of HAstV emerged about 3,800 years ago, and the effective population of HAstV has declined in the last 100 years. The Bayesian skyline plot of the HAstV population shows that the population has been relatively stable over time, but that the genetic diversity of HAstV has declined over the past 100 years. Recombination analysis showed no evidence of recombination in the HAstV-MLB2-YJMGK strain. The genetic diversity of HAstV-MLB2-YJMGK was analyzed using a partial capsid protein sequence obtained by high-throughput sequencing; a total of 81 mutation sites were found in the 210 bp sequence, and the virus
showed single nucleotide polymorphism (SNP) in the patient. The ratio of nonsynonymous to synonymous substitutions (dN/dS) in the S and P regions of the capsid was 1.667, indicating positive selection. BEAST analysis gave a mean nucleotide substitution rate for HAstV of 1.97×10⁻³ per year, with a 95% HPD of 2.76×10⁻⁴ to 5.77×10⁻⁴. Conclusion: We analyzed the viral metagenome in the stool of allo-HSCT patients and found that many different viruses were present in specimens from such patients. The capsid protein antigenicity and ORF1b region immunoreactivity of HAstV-MLB2-YJMGK were unchanged. The high variability of ORF1b suggests that the virus may affect its own replication and proliferation through changes in RdRp activity and in regulation during replication. The genetic distance of HAstV-MLB2-YJMGK is close to the previously reported genetic distance of HAstV-MLB2 and primate AstV, and the strain forms an independent branch in the evolutionary tree. The common ancestor of HAstV dates to about 3,800 years ago, and the population has remained relatively stable. In recent years, the population of HAstV has been decreasing, which may be due to improved economic conditions and medical care. HAstV-MLB2-YJMGK persists in immunocompromised patients with multiple mutations under positive selection, which suggests that HSCT patients may be a source of new AstV. |
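
The dN and dS estimates described in the Methods rest on the Jukes-Cantor correction, d = -(3/4) ln(1 - (4/3)p), applied to the proportions of nonsynonymous and synonymous differences. The following is a minimal sketch of that arithmetic only, not the MEGA implementation used in the study. The function names and the site and difference counts are hypothetical and chosen purely for illustration; in the Nei-Gojobori method the counts come from a codon-by-codon comparison that is not reproduced here.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def dn_ds(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float) -> tuple[float, float, float]:
    """Return (dN, dS, dN/dS) from difference and site counts.

    The four counts are assumed to have been obtained beforehand,
    for example by a Nei-Gojobori style codon comparison.
    """
    p_n = nonsyn_diffs / nonsyn_sites  # proportion of nonsynonymous differences
    p_s = syn_diffs / syn_sites        # proportion of synonymous differences
    d_n = jukes_cantor(p_n)
    d_s = jukes_cantor(p_s)
    if d_s == 0.0:
        raise ValueError("dS is zero, so dN/dS is undefined")
    return d_n, d_s, d_n / d_s


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    d_n, d_s, ratio = dn_ds(nonsyn_diffs=10, nonsyn_sites=150,
                            syn_diffs=2, syn_sites=50)
    print(f"dN={d_n:.4f} dS={d_s:.4f} dN/dS={ratio:.3f}")
```

A ratio above 1, as reported here for the S and P regions of the capsid, is conventionally read as evidence of positive selection, while a ratio below 1 points to purifying selection.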