| Seed size is one of the key determinants of crop yield. Understanding the genes and mechanisms that regulate seed size can help improve flax seed yield. To date, research on seed size has focused mainly on Arabidopsis, rice, and maize, and reports in flax are very scarce. Classical functional gene mining identifies target genes through mutant screening or genetic mapping in mapping populations. However, these methods yield few candidate genes, which limits their usefulness in genetic research on complex quantitative traits. In this study, using nearly 1,000 flax germplasms and 200 flax re-sequencing datasets obtained by our team, we identified seed size-related genes in flax seeds based on genomic imprinting, a genome-wide association study (GWAS), and transcriptome analysis. A preliminary functional analysis of the flax seed size-related candidate genes was then conducted in Arabidopsis to verify the reliability of these methods for identifying candidate genes. Our results provide important resources and a genetic foundation for further research on seed size regulation and seed improvement in flax. The main findings were as follows: (1) We identified 482 imprinted SNP loci, including 253 maternally expressed and 229 paternally expressed imprinted SNP loci, in flax (Linum usitatissimum L.) endosperm at 7 days after pollination, isolated from reciprocal crosses between CIli2719 (C) and Z11637 (Z). Mapping these imprinted SNPs to genes, we identified 248 candidate imprinted genes, including 114 maternally expressed imprinted genes (MEGs) and 134 paternally expressed imprinted genes (PEGs). Among these imprinted genes, only 28 clustered in specific chromosomal regions, forming 14 mini-clusters. Imprinted genes were not well conserved between flax and other plant species. MEGs tended to be expressed specifically in the endosperm, whereas the expression of PEGs was not tissue-specific. The phylogenetic tree and
principal component analysis based on imprinted SNPs separated 200 flax varieties into three subgroups: oil flax, oil-fiber dual-purpose flax, and fiber flax. The nucleotide diversity (π) of imprinted genes in the oil flax subgroup was significantly higher than that in the fiber flax subgroup, indicating that some imprinted genes underwent positive selection during flax domestication from oil flax to fiber flax. Moreover, imprinted genes that underwent positive selection were related to flax functions. Thirteen imprinted genes related to flax seed size and weight were identified using a candidate gene-based association study. (2) Morphological and cellular observations showed that seed development in Z (~3.7 g 1,000-seed weight) occurred earlier than in C (~10.5 g 1,000-seed weight). RNA sequencing identified 1,751 protein-coding genes that were differentially expressed between C and Z in early seeds, torpedo-stage embryos, and endosperms. Homologous alignment revealed that 129 differentially expressed genes (DEGs) in flax were homologous to 71 known seed size-related genes in Arabidopsis thaliana and rice. These DEGs controlled seed size through multiple processes, among which phytohormone signaling pathways and transcription were the most important. Moreover, 54 DEGs were found to be associated with seed size and weight in a DEG-based association study. Nucleotide diversity analysis of the seed size-related candidate DEGs identified by homologous alignment and association analysis showed that π values decreased significantly during flax acclimation from oil to fiber flax, suggesting that some seed size-related candidate genes were selected during this process. (3) In a genome-wide association study of three seed size-related traits, namely thousand-seed weight (TSW), seed length (SL), and seed width (SW), in a natural population of 200 flax varieties, we identified 2,907 TSW, 1,367 SL, and 1,191 SW significant SNPs; 734 TSW, 274 SL, and 232 SW QTNs; 219 TSW, 66 SL, and 39 SW
QTLs; and 3,937 TSW, 1,227 SL, and 691 SW preliminary candidate genes. Combining GWAS with RNA-seq analysis, we further analyzed four QTLs and identified multiple candidate genes associated with seed size and weight, such as Lus10008949, Lus10008953, Lus10008954, Lus10036370, Lus10036372, Lus10021040, and Lus10014281. (4) Recombinant pCAMBIA2300 vectors carrying five seed size-related candidate genes identified by the above three methods were constructed and heterologously expressed in Arabidopsis Col-0 by Agrobacterium tumefaciens-mediated inflorescence transformation. Overexpression of Lus10008953 and Lus10036372 resulted in smaller seeds in Col-0, whereas overexpression of Lus10021040, Lus10014281, and Lus10004029 increased seed size. These results indicate that our genome- and transcriptome-based methods for identifying seed size-related genes are reliable and can be used for further analysis. |
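
The subgroup comparisons of nucleotide diversity (π) reported in findings (1) and (2) rest on a simple quantity: the average number of pairwise differences per site. The following is a minimal sketch of that calculation, assuming biallelic SNPs and the same number of sampled chromosomes at every site. The function name `nucleotide_diversity` and its inputs are hypothetical and chosen for illustration; this is not the pipeline used in the study, and the numbers in the usage example are invented.

```python
from typing import Sequence


def nucleotide_diversity(alt_counts: Sequence[int], n_chrom: int, seq_len: int) -> float:
    """Per-site nucleotide diversity (pi) for a set of biallelic SNPs.

    alt_counts: alternate-allele count at each SNP (hypothetical input format).
    n_chrom:    number of sampled chromosomes, assumed equal at all sites.
    seq_len:    length in bp of the region the SNPs were called in.
    """
    if n_chrom < 2:
        raise ValueError("need at least two chromosomes to compare")
    if seq_len <= 0:
        raise ValueError("sequence length must be positive")

    n_pairs = n_chrom * (n_chrom - 1) / 2  # number of chromosome pairs
    total = 0.0
    for k in alt_counts:
        if not 0 <= k <= n_chrom:
            raise ValueError("allele count out of range")
        # k * (n - k) pairs differ at this site; divide by all pairs
        total += k * (n_chrom - k) / n_pairs
    return total / seq_len


if __name__ == "__main__":
    # Invented counts for one 1,000-bp gene in two hypothetical subgroups.
    pi_group_a = nucleotide_diversity([10, 25, 40], n_chrom=100, seq_len=1000)
    pi_group_b = nucleotide_diversity([2, 0, 5], n_chrom=100, seq_len=1000)
    print(pi_group_a > pi_group_b)
```

A lower π in one subgroup than another for the same genes is the kind of pattern the abstract interprets as a signal of selection. In practice, sites with missing genotypes have differing sample sizes, which this sketch does not handle.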