Whole Genome Sequencing And Analysis Of Functional Genes In Salix suchowensis

Posted on: 2018-10-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X G Dai
GTID: 1313330566450002
Subject: Tree genetics and breeding
Abstract/Summary:
Genome sequencing is essential for studies of gene function, gene regulation, map-based cloning and related work. The whole-genome sequencing of Populus trichocarpa led to the emergence of poplar as the model plant for many aspects of genetic research on woody plants. However, poplar has a long life span and a large body size, which hamper genetic studies that use it as a model system. S. suchowensis is a native species that normally reaches sexual maturity within one year. It also has many characteristics desirable for genetic studies, such as small body size, ease of asexual propagation, and suitability for large-scale field experiments. In this dissertation, we sequenced the genome of S. suchowensis, and the main results were as follows:

1. The genome of S. suchowensis was sequenced on the Roche 454 platform using a whole-genome shotgun strategy. The reads used for genome assembly comprised 16.5-fold coverage of 454 single reads, 7.3-fold of mate-pair reads, and 231.1-fold of Illumina paired-end reads. The resulting assembly covers 71.5% of the whole genome, with a total length of 303.2 Mb. The scaffold and contig N50 sizes of the assembly were 925.0 kb and 17.4 kb, respectively. The sequencing depth distribution showed that over 94.0% of the assembly was covered at more than 20-fold depth. The assembly covered 94.2% of Salix ESTs and 97.6% of the unigenes assembled from the sequenced individual. Coverage of the core eukaryotic genes was estimated to be 97.8% for the S. suchowensis assembly. Together, these statistics indicate that the genome assembly has high contiguity and accuracy.

2. In total, 26,599 protein-coding genes were predicted using de novo, homology-based and RNA-seq-based methods. BLAST searches of the gene sequences against the nr, Swiss-Prot and TrEMBL protein databases showed that 25,871 genes have homologous sequences in at least one of the databases, accounting for 97.3% of all genes. A combination of de novo and homology-based comparisons identified a total of 125.6 Mb (41.4%) of nonredundant repetitive sequences within the assembled S. suchowensis genome. In addition to the protein-coding genes, we also identified 828 tRNAs, 161 rRNAs, 598 snRNAs, and 269 microRNAs in the S. suchowensis genome.

3. The genome size of S. suchowensis was estimated to be 429.3 Mb, which is in concordance with the estimate from 17-mer analysis based on 40-fold coverage of high-quality Illumina reads. Salix and Populus are sister genera, but the genome of S. suchowensis is about 60 Mb smaller than that of P. trichocarpa. This willow was found to have a smaller genome and lower gene content than P. trichocarpa, suggesting more extensive loss of DNA and genes in willow than in poplar during the divergence of the two lineages.

4. Comparative genomic analysis revealed that the assembled genome is highly collinear with the poplar genome. This result directly supports the view that both genera share the 'Salicoid' whole-genome duplication (WGD) event in their evolutionary history. This WGD is estimated to have occurred around 58.08 ± 0.11 million years ago. The divergence between Salix and Populus took place 6 million years after the appearance of the crown of their progenitor.

5. Comparisons of gene expansions between willow and poplar showed that the fractions of genes associated with WGD or segmental duplications were higher in poplar than in willow, suggesting that poplar retained more 'Salicoid' duplicates than willow after their divergence. Even in the families with more genes in willow, these genes were rarely associated with WGD or segmental duplications, but were mainly derived from expansions through lineage-specific tandem duplications and transposons.

6. Using 1,232 single-copy orthologous genes from Salix, Populus, Arabidopsis, Vitis and Oryza together with set calibration points, we estimated mean substitution rates of 1.09×10⁻⁹ and 0.67×10⁻⁹ per site per year for S. suchowensis and P. trichocarpa, respectively. The results also revealed that woody plants have lower substitution rates than herbaceous plants, and that woody species with long generation times have lower substitution rates than early-flowering species.

7. The transcriptomes of female flowers at different times after pollination were sequenced on the Illumina HiSeq 4000 platform. A total of 32,367 unigenes were assembled, with a total length of 53.6 Mb and an N50 length of 2,275 bp. We then analyzed the putative microsatellites in the obtained unigenes and developed an SSR primer database from the transcriptome sequences.

8. By comparing the expression profiles of the female flowers, 927 differentially expressed genes were identified. GO enrichment analysis revealed that the 257 genes up-regulated 144 h after pollination are mainly enriched in the molecular functions of glucosyltransferase activity, calcium ion binding and ADP binding. Calcium ions are crucial for fiber initiation in cotton. The enriched genes identified here provide essential information for future studies on the regulation of seed hair development in S. suchowensis.
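
The abstract reports scaffold, contig and unigene N50 values without defining the statistic. As a minimal sketch, assuming the common definition (the length L such that sequences of length L or longer contain at least half of the total assembled bases), N50 can be computed as below. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the dissertation.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumes the common definition: sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical contig lengths, in bp): the total is 54,
# and the three longest contigs (10 + 9 + 8 = 27) reach half of it.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

Applied to the real scaffold or contig lengths of an assembly, the same routine would give values comparable to the 925.0 kb and 17.4 kb figures quoted above, provided the assembler used the same definition.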
Keywords/Search Tags: S. suchowensis, genome sequence, genome evolution, substitution rate, transcriptome, differentially expressed gene