| Objective: To investigate the composition and characteristic changes of the gut viral community in breast cancer patients, and to explore the potential association between gut viruses and breast cancer subtypes, we conducted a viral metagenomic study on stool samples from 26 breast cancer patients and 25 healthy individuals undergoing routine checkups at the First Affiliated Hospital of Nanchang University. Methods: 1. After sample collection, virus-like particles (VLPs) were extracted from the processed samples. Library construction methods suited to each viral genome type were used to build viral metagenomic libraries, which passed quality control and were then sequenced on the Illumina platform. 2. The viral reads were aligned against a viral database and annotated using BWA. Differential abundance and diversity analyses were then performed on the annotated species. Finally, we analyzed the correlations among bacteria, metabolic pathways, and clinical phenotypes. Results: 1. Using the Illumina MiSeq sequencing platform, we obtained a total of 582.7 Gb of data (1,942,474,022 raw reads) from 51 fecal samples by high-throughput sequencing. Of the 1,452,560,548 clean reads, 98,669,201 (6.79%) mapped to viral sequences. 2. Eleven viral families were detected across the 51 samples: Microviridae (37.92%), Podoviridae (16.34%), Siphoviridae (15.46%), unclassified viruses (12.48%), Inoviridae (9.39%), Ackermannviridae (2.63%), Myoviridae (2.59%), Genomoviridae (1.92%), Herpesviridae (0.88%), Anelloviridae (0.28%), and Retroviridae (0.04%). Microviridae was the dominant viral family in both the control (Con) and breast cancer (BC) groups, accounting for 39.60% and 36.30%, respectively. 3. Alpha diversity was assessed with the Shannon index, calculated from the RPKM values of the viral sequences in each sample, and the evenness of species within samples was evaluated; there was no significant difference in the Shannon 
index between the two study groups (P = 0.69), indicating that viral diversity did not differ significantly between them. Beta diversity analysis suggested high intra-group and inter-group similarity, with no significant separation between the two groups. 4. We taxonomically annotated the 31,874 assembled viral contigs, identifying a total of 313 genera and 2,655 species. The predicted functional genes were compared by homology against the KEGG gene database: at Level 2, the most abundant viral genes were concentrated in functions related to protein families; at Level 3, 226 metabolic pathways were annotated. 5. Linear discriminant analysis (LDA) was used to estimate how strongly the abundance of each viral contig contributed to the between-group difference. At the genus level, Brussowvirus, Gemykibivirus, Bendigovirus, Phicbkvirus, Phikzvirus, and Pbunavirus were significantly more abundant in the healthy control group than in the breast cancer group, whereas Bongovirus, Casadabanvirus, Gemykrogvirus, Poushouvirus, Wbetavirus, Gemycircularvirus, and unclassified members of the family Anelloviridae were significantly more abundant in the breast cancer group than in the healthy control group. At the species level, the following 12 viruses were significantly enriched in the healthy control group: Roseburia_phage_Jekyll, Streptococcus_phage_Javan320, Podoviridae_sp_ctcf755, Clostridium_phage_HM2, Streptococcus_phage_P7951, Stenotrophomonas_phage_Smp131, Streptococcus_phage_Javan374, Streptococcus_phage_Str_PAP_1, Lactococcus_phage_98104, Staphylococcus_phage_Terranova, Pseudomonas_phage_vB_PaeM_SMS29, and Streptococcus_phage_Javan336. The following three were significantly enriched in the breast cancer group: gemycircularvirus, Stenotrophomonas_phage_S1, and Torque_teno_virus. 6. The Mann-Whitney rank sum 
test (Mann-Whitney U test) was used to calculate the P-value for the relative abundance of each virus-related sequence between the two groups. P-values were calculated for all 31,874 viral contigs, of which 1,547 differed significantly between groups. Of the 313 annotated genera, 16 showed statistically significant differences, and of the 2,655 annotated species, 136 showed statistically significant differences. Conclusions: 1. Viral metagenomics provided a preliminary picture of the composition of the gut viral community in breast cancer patients. 2. Diversity analysis showed that the gut viral community was rich in diversity and similar in composition in breast cancer cases and healthy controls. 3. This study used viral metagenomics to explore the differences in gut viruses between breast cancer patients and healthy controls. It not only adds to existing research on the role of gut viruses in breast cancer, but also lays a theoretical foundation for identifying breast cancer-associated viruses and for preventing and controlling potential new viruses. |
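
The abundance and statistics steps named in the Results (RPKM normalization, the Shannon index, and the Mann-Whitney U test) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names and all example numbers are hypothetical, the Shannon index is assumed to use the natural logarithm, and the two-sided P-value is assumed to come from the normal approximation with tie correction and no continuity correction. That approximation is reasonable for groups of about 25 samples but not for very small ones.

```python
import math


def rpkm(read_count, contig_length_bp, total_mapped_reads):
    """Reads per kilobase of contig per million mapped reads."""
    return read_count * 1e9 / (contig_length_bp * total_mapped_reads)


def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero abundance."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation with tie correction."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both groups, remembering which group each value came from (0 = x, 1 = y).
    pooled = sorted((v, g) for g, values in enumerate((x, y)) for v in values)
    rank_sum_x = 0.0
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and all receive the average rank.
        midrank = (i + 1 + j) / 2
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all pooled values are identical
    z = (u - mean_u) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical per-sample abundances of one viral taxon in two groups.
    control = [12.1, 8.4, 15.0, 9.9, 11.3]
    cases = [4.2, 6.8, 5.1, 7.7, 3.9]
    print(mann_whitney_u(control, cases))
    print(shannon_index([rpkm(500, 5000, 2_000_000), rpkm(120, 3000, 2_000_000)]))
```

When thousands of contigs are tested this way, as with the 31,874 contigs here, the raw P-values are usually interpreted alongside a multiple-testing correction. The abstract does not state whether one was applied.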