| The legume species Astragalus sinicus (Chinese milk vetch, CMV; 2n = 2x = 16) is the most important green manure/cover crop in southern China. Growing CMV in rice soil during the winter fallow season can substitute for a large part of mineral nitrogen fertilizer and maintain high rice yields, while taking full advantage of natural resources (such as light and heat) and improving the ecological environment. The utilization of CMV is considered an effective management practice for establishing a sustainable green manure/cover crop-rice pattern in southern China. Moreover, CMV can also serve as a high-quality forage grass, a honey source, a wild vegetable and an important traditional Chinese medicine. Currently, genetic and symbiotic nitrogen fixation (SNF) studies of CMV lag behind those of other legumes because a reference genome is lacking. Here, we generated the first chromosome-scale reference genome of CMV by combining PacBio sequencing, Illumina sequencing and Hi-C technologies. Comparative genome analysis sheds light on the genetic basis of nodulation and symbiosis in CMV. Nodule transcriptome analysis of varieties differing in nitrogen fixation efficiency revealed the mechanism underlying these efficiency differences. The main results and innovative findings were as follows: (1) Genome de novo sequencing, assembly and annotation. The final assembly comprised 595.52 Mb of sequence, accounting for 95.26% of the estimated genome size, with a contig N50 of 1.41 Mb. The chromosome-scale assembly was based on Hi-C reads, and a total of 575.78 Mb of sequence with a scaffold N50 of 78.42 Mb was anchored onto eight pseudochromosomes (2n = 16), accounting for 96.66% of the assembled sequences. The base accuracy and completeness of the assembled CMV genome were validated. The mapping rate was 97.33% and the genome coverage was 98.37%. BUSCO showed that 91.10% of the 1440 single-copy plant orthologs were complete, and CEGMA showed that the assembled genome completely covered 238 (96.77%) of the 248 core
eukaryotic genes (CEGs). The Hi-C heat map showed that interactions within chromosomes were more frequent than interchromosomal interactions, suggesting that the Hi-C assembly was of high quality. Collectively, these data indicated that our genome assembly was of high quality and coverage. In total, 59.84% of the CMV genome consisted of repetitive sequences, and 34,253 protein-coding genes were annotated, 91.50% of which were functionally annotated. (2) LTR insertion and genome size expansion. Of the repetitive sequences, 96.97% were transposable elements (TEs), occupying 58.03% of the genome. Most TEs were long terminal repeat (LTR) elements, making up 45.52% of the genome. Moreover, we found a strong correlation between genome size and the total length of LTR elements in several legume species (Spearman correlation coefficient r = 0.83), indicating a role of LTR elements in genome size expansion in these legumes. The numbers and total lengths of LTR elements were amplified in CMV compared with the G. uralensis, C. arietinum, M. truncatula and M. sativa genomes and underwent a dramatic burst 0.5 MYA, which may have led to the genome size expansion. (3) The genetic basis of nodulation and symbiosis in CMV. The phylogenetic tree showed that CMV and its close relatives in the IRLC clade clustered into one monophyletic group. CMV diverged from C. arietinum 19.1 MYA, after the divergence of G. uralensis approximately 26.4 MYA. CMV underwent the whole-genome duplication (WGD) event shared by Papilionoideae species, and this WGD event retained symbiosis and nodulation genes. An analysis using CAFE showed 456 expanded and 164 contracted gene families in CMV compared with the common ancestor of CMV, C. arietinum, P. sativum, T. pratense, M. truncatula and M. sativa. The expanded gene families of CMV were enriched in flavonoid biosynthesis according to KEGG pathway analysis. Moreover, the gene family encoding chalcone synthase (CHS), the rate-limiting enzyme in flavonoid biosynthesis, was expanded and expressed primarily in the root of CMV, and thus may provide the spatially essential flavonoid signaling in the root
for successful nodulation. R genes were enriched in the root, which may enhance root defense to cope with the immunosuppressive effect caused by rhizobial infection. The expression patterns of R genes differed between the root and the nodule in CMV. (4) The nitrogen fixation ability of the main CMV varieties and the mechanism underlying differences in nitrogen fixation efficiency in CMV. The CMV varieties differed in nitrogen fixation ability, and local varieties such as Yijiangzi and Ningbodaqiao had higher nitrogen fixation ability than cultivated varieties such as Xinzi No.1. The dry weight, nitrogen accumulation, number of effective nodules, and fresh weight of nodules in Yijiangzi and Ningbodaqiao were significantly higher than those in Xinzi No.1 when inoculated with 4 re-screened rhizobia strains. Highly expressed genes in nodules mainly included leghemoglobin, NCR peptide and late nodulin genes, which play a key role in nitrogen fixation in the nodule. In the nodules of Yijiangzi, genes related to the carbohydrate metabolism pathway were up-regulated whereas genes related to the lipid metabolism pathway were down-regulated, which could provide more carbon sources for rhizobia. Most of the genes related to bacteroid development and symbiotic metabolism were up-regulated in the nodules of Yijiangzi, which could promote bacteroid efficiency. In summary, the whole genome sequence of CMV was obtained for the first time, providing a valuable genetic information resource for molecular breeding and genetic improvement in CMV. The genetic basis of symbiotic nodulation and the mechanism underlying differences in nitrogen fixation efficiency were revealed for the first time, laying the theoretical basis for SNF studies in CMV. |