| Hemibagrus wyckioides (Siluriformes: Bagridae) is an economically important fish endemic to southwest China and Southeast Asia, distributed mainly in the Lancang-Mekong River basin. H. wyckioides is sexually dimorphic, with males growing faster than females. It is a high-quality aquaculture fish with fast growth, tolerance of low oxygen, strong disease resistance and high nutritional value. In recent years, however, the massive construction of hydropower stations and overfishing have sharply reduced the number of wild H. wyckioides in the Lancang-Mekong River basin. The development of genomics has made artificial breeding and the protection of wild populations of H. wyckioides possible. A chromosome-level genome of H. wyckioides not only provides genomic resources for subsequent research on its population genetics, conservation biology and functional genomics, but also promotes the development of molecular marker-assisted breeding and sex-controlled breeding technologies. We therefore sequenced and assembled the genome of H. wyckioides at the chromosome level using second- and third-generation sequencing technologies, carried out an evolutionary analysis based on the whole genome, and made a preliminary exploration of its functional genes. The main research contents and results are as follows:
(1) Chromosome-level genome assembly and annotation of H. wyckioides. In this study, H. wyckioides samples captured from the Lancang River basin in Xishuangbanna (Yunnan Province, China) were used for sequencing. A total of 45.89 Gb of raw data was obtained using Illumina sequencing technology, and the genome size and heterozygosity of H. wyckioides were estimated from these data. K-mer analysis showed that the genome size was about 779.64 Mb and the heterozygosity was 0.3%. NextDenovo software was used for a pure
third-generation assembly (Nanopore and PacBio). The final genome size was 789.79 Mb and the contig N50 was 22.08 Mb, indicating a high-quality assembly. The chromosome-level genome was then constructed using Hi-C technology: most of the sequences were anchored to 29 chromosomes, representing 97.7% of all contigs. Protein-coding genes were annotated by combining ab initio prediction, homology with related species, and transcriptome data. Repetitive sequences and non-coding RNAs were also annotated. Repetitive sequences make up 40.12% of the H. wyckioides genome, of which tandem repeats (TRs) account for 2.99% and interspersed repeats (mainly transposable elements, TEs) account for 37.13%. A total of 22,794 genes were annotated in the genome. Gene completeness was assessed with BUSCO against the single-copy orthologous gene set in the actinopterygii_odb10 database; about 94.59% of the complete gene elements were found in the H. wyckioides genome, indicating that most of the predicted conserved genes are relatively complete. In addition, the distributions of average CDS length, average exon number per gene, average exon length and average intron length in H. wyckioides are consistent with those of related species (such as T. fulvidraco, B. yarrelli and G. maculatum), and synteny analysis based on whole-genome sequence alignment shows good collinearity between the genomes of H. wyckioides and I. punctatus. Together, these results indicate that the genome assembly and annotation are highly complete and accurate.
(2) Evolutionary analysis and preliminary study of functional genes of H. wyckioides based on the whole genome. To explore the phylogenetic relationships and divergence times of H. wyckioides and its related taxa, this study conducted maximum likelihood (ML) phylogenetic analysis and relaxed molecular clock estimation. OrthoMCL software was used to perform gene family clustering analysis on 18 species: all protein sequences were compared with BLASTP (E-value ≤ 1e-5), and the Markov clustering (MCL) algorithm was then used to obtain the orthologous genes, paralogous genes and single-copy orthologous genes of each species. Eighteen bony fishes, including H. wyckioides, were selected to construct ML phylogenetic trees (with L. oculatus as the outgroup) and to estimate divergence times based on single-copy orthologous genes. The phylogenetic analysis showed that H. wyckioides and T. fulvidraco, both of which belong to the family Bagridae, form one branch. The Siluriformes are a monophyletic group that is sister to the Gymnotiformes; this clade is in turn successively sister to Characiformes and Cypriniformes, namely (Cypriniformes, (Characiformes, (Siluriformes, Gymnotiformes))). The monophyly of both Otophysa and Siluriformes is supported. The Otophysa began to diverge at about 235.12 Ma, the Siluriformes and the Gymnotiformes diverged from their most recent common ancestor at about 118 Ma, and H. wyckioides and T. fulvidraco diverged at about 42 Ma. To explore the functional genes related to important biological characteristics such as fast growth, large size and hypoxia tolerance, we performed gene family clustering analysis, positive selection analysis, and gene family expansion and contraction analysis. Among the 18 species analyzed, 383 genes were unique to H. wyckioides (that is, in the gene family clustering analysis they clustered only in H. wyckioides, with a cluster count of 0 in every other species). Among them, the main function of the hba1 (hemoglobin subunit alpha-1) gene is to transport oxygen to the tissues of the body, so it may play a role in the tolerance of H. wyckioides to hypoxic environments. There are 9 positively selected genes in the H. wyckioides genome, namely col4a6, elovl1, emx1, id2, mag, ndrg4-a, plekhf2, pqlc1 and tchp. Among
them, col4a6, id2, mag, plekhf2 and tchp are related to diseases, and emx1 and ndrg4-a are related to brain and heart development. In addition, elovl1 is an important gene for long-chain fatty acid biosynthesis. Compared with the other 17 species, 398 gene families have expanded during the evolution of H. wyckioides. The significantly expanded gene families are mainly enriched in four types of KEGG pathways: (1) immune-related pathways; (2) metabolism-related pathways; (3) growth-related pathways; and (4) environmental information processing pathways. MHC genes are an important part of the vertebrate adaptive immune system; they are highly expressed in epithelial and lymphoid tissues and are responsible for recognizing and presenting antigens. The strong disease resistance of H. wyckioides may therefore be related to the expansion of the MHC1 gene (13 copies). The origin recognition complex subunit 4 (orc4) gene family has also expanded (11 copies); this gene family is necessary for cell proliferation, and the biological pathways in which it is enriched, such as meiosis and the cell cycle, are important in the growth and development of H. wyckioides. The expansion of such growth- and development-related genes may help explain the large size and fast growth of H. wyckioides. |
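The contig N50 cited above (22.08 Mb) is a standard measure of assembly contiguity: it is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly. The sketch below shows one common way to compute it. The function name `contig_n50` and the toy lengths are illustrative assumptions; they are not the tool or the data used in this study.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Illustrative helper only; not taken from any assembler's code.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    # Walk the contigs from longest to shortest, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to half of the
        # total assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths, not data from this study.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(contig_n50(toy_lengths))
```

With the toy lengths, the total is 300, and the two longest contigs (80 + 70) already reach half of it, so the N50 is 70. A higher N50 for a similar total assembly size means the assembly is made of fewer, longer contigs.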