| The genus Bos, in the subfamily Bovinae, includes Bos taurus, Bos indicus, gaur, gayal, banteng, yak, wisent and bison. These species comprise a wide variety of breeds distributed around the world, and their genomes harbor rich genetic variation. Structural variation (SV) is an important source of genomic diversity: it affects far more of the genome than single-nucleotide polymorphisms (SNPs) and is more likely to disrupt genomic sequence, thereby altering gene expression and phenotype. Recent studies have demonstrated that graph pan-genomes constructed from multiple genomes can capture genetic diversity and reveal the complexity of genomes. In this study, we constructed a graph pan-genome of bovine animals from 16 published bovine genome assemblies based on third-generation sequencing and 10 Chinese indicine genomes. We comprehensively characterized the variation captured by the pan-genome and then applied the pan-genome to genotype structural variations in 152 domestic cattle samples sequenced by next-generation sequencing (NGS). The main results are as follows: 1. We used the genomes of 9 Bos taurus cattle, 12 Bos indicus cattle, 2 domestic yaks, 1 gaur, 1 American bison and 1 wild yak to construct a graph pan-genome of the genus Bos. We identified 288 Mb of sequence absent from the reference genome, of which sequences longer than 50 bp accounted for 172 Mb. De novo gene prediction on the non-reference sequences identified 1,887 novel genes, with an average gene length of 5.4 kb and an average coding-region length of 931 bp. Functional enrichment analysis revealed that these genes are associated with biological activities and immunity. 2. The graph pan-genome captured 71.5 million small variants (<50 bp) and 216,958 SVs. Small variants were mostly biallelic, whereas SVs occurred in multiallelic forms, reflecting the diversity of genomic variation in bovine species. Furthermore, the graph pan-genome could capture the majority of SVs 
detected by methods based on linear reference genome alignment. Annotation of repetitive sequences within the SVs revealed that approximately 123,000 SVs were annotated as transposable-element repeats, the majority being long interspersed nuclear elements (LINEs). Functional annotation of genes overlapping the SVs showed that only 2,024 SVs fell within exonic regions, indicating strong negative selection against SVs in coding regions. Additionally, SVs tended to be enriched at chromosome ends, which are rich in repetitive sequences. 3. Using a graph pan-genome built from domestic cattle populations, we genotyped SVs in 152 whole-genome resequencing samples and identified approximately 120,000 high-quality SVs. Principal component analysis, phylogenetic analysis, and admixture analysis grouped these 152 domestic cattle into six major worldwide cattle populations. Selection analysis was then conducted on indicine (zebu) and taurine cattle. An intronic deletion in the DNAJC18 gene may be associated with the environmental adaptability of zebu cattle. Additionally, we identified an upstream insertion in the IFIT3 gene that is related to immunity and disease resistance in zebu cattle. Furthermore, a series of structural variations were discovered in genes associated with traits such as growth, production, and reproduction. The super pan-genome covering multiple bovine species provides insights into the genetic diversity of bovine species and serves as a valuable reference for genomic research and breeding in cattle. |
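
The abstract separates small variants (<50 bp) from SVs and contrasts biallelic with multiallelic sites. The following is a minimal Python sketch of that bookkeeping, not the study's pipeline. The `Variant` class, the helper functions, and the toy sites are hypothetical, and measuring size as the largest length difference between the reference and any alternative allele is an assumption (it does not handle balanced events such as inversions).

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

# Size threshold in bp; the abstract treats variants below 50 bp as "small".
SV_MIN_LEN = 50


@dataclass
class Variant:
    """Hypothetical container for one variant site (not from the study)."""
    ref: str         # reference allele sequence
    alts: List[str]  # one or more alternative allele sequences


def size_class(v: Variant) -> str:
    """Return 'SV' or 'small' from the largest ref/alt length difference.

    Assumption: size is the maximum absolute length change across
    alternative alleles, so SNPs have size 0.
    """
    change = max(abs(len(alt) - len(v.ref)) for alt in v.alts)
    return "SV" if change >= SV_MIN_LEN else "small"


def allelic_class(v: Variant) -> str:
    """A site with exactly one alternative allele is biallelic."""
    return "biallelic" if len(v.alts) == 1 else "multiallelic"


def tally(variants: List[Variant]) -> Counter:
    """Count sites per (size class, allelic class) pair."""
    return Counter((size_class(v), allelic_class(v)) for v in variants)


# Toy sites for illustration only; these are not data from the study.
toy_sites = [
    Variant("A", ["G"]),                            # SNP
    Variant("A", ["A" + "T" * 60]),                 # 60 bp insertion
    Variant("A" + "C" * 80, ["A", "A" + "C" * 20])  # two deletion alleles
]

if __name__ == "__main__":
    for key, n in sorted(tally(toy_sites).items()):
        print(key, n)
```

Run on a real call set, a tally of this kind would give the per-class counts that the abstract summarizes, provided the same size definition is used.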