
Whole-genome Sequencing Of Cultivated And Wild Peppers Provides Insights Into Capsicum Domestication And Specialization

Posted on: 2015-03-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Tan
Full Text: PDF
GTID: 1223330482475334
Subject: Crop Genetics and Breeding
Abstract/Summary:
As an economic crop, pepper satisfies the demand for pungent flavor and has medicinal uses worldwide. To gain a better understanding of Capsicum evolution, domestication and specialization, we present here the genome sequences of the cultivated pepper Zunla-1 (Capsicum annuum L.) and its wild progenitor Chiltepin (C. annuum var. glabriusculum).

1. Large genome assembly and chromosome anchoring

Using the whole-genome shotgun approach, we generated a total of 325 Gb and 205 Gb of high-quality reads from various Illumina sequencing libraries for Zunla-1 and Chiltepin, respectively. K-mer analysis estimated the genome size of Zunla-1 at 3.26 Gb, slightly larger than the 3.07 Gb of Chiltepin; both estimates are consistent with a previous report. Short sequencing reads, corresponding to 99-fold and 67-fold genomic depth, were hierarchically and iteratively assembled into contigs with an N50 length (50% of the genome is in fragments of this length or longer) of 55 Kb and 52 Kb for Zunla-1 and Chiltepin, respectively. Paired-end information was then used sequentially in the assembler SOAPdenovo to generate scaffolds totaling 3.48 Gb and 3.35 Gb, with N50 lengths of 1.23 Mb and 445 Kb, respectively. In the following analyses, we refer to the Zunla-1 assembly as the reference for the C. annuum genome.

The scaffolds were then anchored to 12 linkage groups using 7,657 SNP markers in our newly developed high-density genetic map; the linkage groups were assigned as Chr01-Chr12 according to cytological analysis. The pseudo-chromosomes consist of 4,956 scaffolds carrying 31,201 genes, corresponding to 79% of the reference. Chromosome translocation events have been reported to differentiate cultivars from wild progenitors during domestication, which helped us to precisely anchor 29,081 scaffolds (2.42 Gb; 30,123 genes) of Chiltepin to chromosomes.

2. Repetitive elements and genome expansion

We found that more than 81% (~2.7 Gb) of the pepper genomes was composed of different transposable elements (TEs), which is significantly higher than the proportion (~61%) in potato and tomato. Most plant TE categories were identified in pepper, including long terminal repeat (LTR) retrotransposons (70.3%) and DNA transposons (4.5%). LTR retrotransposons thus contributed more to genome expansion in pepper than in potato (47.2%), tomato (50.3%) and grape (46.2%), a pattern that parallels the maize genome (75%). The most abundant LTR retrotransposons belonged to the Gypsy clade (54.5%), followed by Copia (8.6%). Of the TEs identified, 23.1% and 16.2% are ancestral repeats that predate the divergence of pepper from tomato and potato, respectively, whereas other lineage-specific TEs emerged during the genome expansion and account for 50.8% of the pepper genome.

To investigate the genome expansion event in pepper, we dated the insertion time of all LTRs based on divergence analysis. A peak of increased insertion activity was found ~0.3 million years ago (Mya), suggesting that the expansion of the pepper genome was quite recent in the evolution of the Solanaceae family. Analysis of the insertion time and phylogenetic topology of the Copia and Gypsy clades also supported this conclusion. Gypsy elements showed the highest recent insertion activity after the divergence of Solanaceae species, which made them the most abundant in the pepper genome.

3. Gene annotation and transcription

To facilitate gene annotation, we generated 90.5 Gb of RNA sequencing (RNA-Seq) data from 30 libraries representing all primary developmental stages and tissue types, including various fruits. A combination of evidence-based and de novo approaches predicted 35,336 and 34,476 high-confidence protein-coding loci in the reference and Chiltepin genomes, respectively; over 90% of predicted genes were supported by ESTs, RNA-Seq entries, or homologous proteins.
Gene density is relatively low around centromeres, where TE density is correspondingly high, indicating that the repetitive sequences are unevenly scattered along chromosomes.

We also obtained 2,717,180 unique tags by sequencing flower buds and identified 6,527 long noncoding (lnc-) RNAs using an in-house program. Among the lnc-RNAs, 5,976 are intergenic, 222 are intron-overlapping and the others are bidirectional. Sequencing of small RNAs from 5 different tissues allowed the identification of 5,581 phased short interfering RNAs (siRNAs). Based on the plant microRNAs (miRNAs) in the miRBase database, a total of 176 miRNAs were discovered in pepper and classified into 64 families. Comparison with miRNAs of other Solanaceae members and plant species showed that 141 miRNAs are conserved and 35 are specific to pepper. We predicted 1,104 target genes for these miRNAs, of which 78% have putative functions. Notably, about half of the pepper miRNA families potentially play an important role in post-transcriptional regulation by targeting mRNAs encoding transcription factors.

4. Insights into Solanaceae evolution

Sequence-based analysis of pepper gene families was conducted using OrthoMCL in comparison with those in tomato, potato and Arabidopsis. We identified 10,279 gene families shared among the four species and a total of 17,671 in pepper with more than one orthologous gene. Another 1,257 gene families, containing 3,143 genes, were specific to the pepper genome. These pepper-specific genes have various biological functions but are particularly overrepresented in the GO category of biotic stimulus, indicating that pepper has a rapid and strong response to fluctuating environmental conditions.

A total of 5,231 single-copy orthologous genes identified in grape, papaya, pepper, tomato, potato and Arabidopsis were used to construct a phylogenetic tree.
The tree showed that pepper separated from tomato and potato ~36 million years ago (Mya), the period in which the genus Capsicum arose within Solanaceae. We also observed that Solanaceae appeared nearly 156 Mya, very soon after the differentiation of monocots from dicots.

In the pepper genome, we identified 1,052 and 799 large syntenic blocks involving 12,601 and 10,596 genes when compared with tomato and potato, respectively. However, 612 and 430 chromosomal translocation events occurred during the divergence of Capsicum relative to tomato and potato, respectively. These translocations are distributed extensively across all pepper chromosomes, providing evidence for generalized chromosomal rearrangements. In addition, 468 and 367 inversions were identified in pepper when compared with tomato and potato, respectively. Comparison with the grape genome further revealed that a whole-genome triplication happened in the pepper genome, suggesting a common event among the Solanaceae. Considerable loss of one or two copies of duplicated genes occurred after the triplication, leaving few triplicated genes in the pepper genome.

We then calculated the time of whole-genome duplication (WGD) events in Solanaceae lineages based on the distribution of the distance-transversion rate at four-fold degenerate sites (4DTv method) of paralogous gene pairs. Peaks at around 0.48 and 0.1 indicated that the ancestral pepper-grape and pepper-tomato divergences occurred ~89 and 20 Mya, respectively, which is consistent with the phylogenetic analysis. The peak at ~0.3 indicated a recent WGD in the ancestral pepper-tomato lineage. There is no evidence of a Capsicum-specific WGD after the pepper-tomato/-potato divergence, again confirming the notion that proliferation of TEs was the primary contributor to pepper genome expansion.

5. Molecular footprints of artificial selection

Artificial selection, acting through the two breeding processes of early domestication and modern intensive improvement, played an important role in the origin of cultivated peppers. We selected 18 cultivated accessions representing the major varieties of C. annuum and 2 semi-wild/wild peppers for whole-genome resequencing. After aligning the sequencing reads, corresponding to 10-30 fold depth, to the reference, we identified an average of 9,826,526 single nucleotide variations and 237,509 small insertions/deletions. As expected, the wild accessions possessed higher genetic diversity than the cultivars. The neighbor-joining tree and population structure further revealed that the wild and domesticated peppers are genetically distinguishable at an overall genomic level.

We next scanned the genomes of these accessions to identify genome-wide signatures of artificial selection using a genetic bottleneck approach. We identified a total of 115 regions with strong selective sweep signals in the cultivated peppers (85.2 Mb, or 2.6% of the genome, containing 511 genes). The length of these selected regions ranged from 0.3 kb to 61.9 kb, and their polymorphism levels were low relative to the whole genome, indicating that these regions appear to have been affected by selection during domestication.

In total, the 511 genes embedded in selected regions for domesticated peppers were related mainly to transcription regulation, stress and/or defense response, protein-DNA complex assembly, growth and fruit development. Of these, 34 transcription factors (TFs), including members of the AP2, ERF and bHLH families, and 10 disease resistance proteins containing an NB-ARC domain were identified.

6. Comparison of fruit development between pepper and tomato

The ripening process greatly influences fruit quality and shelf life and differs significantly between climacteric fruits such as tomato and non-climacteric fruits such as pepper, which have a slower softening process and no response to ethylene. We compared gene expression profiles between tomato and pepper during fruit ripening. Tomato had 2,281 differentially expressed genes whereas pepper had 1,440, including in both cases genes involved in cell wall remodeling, hormone signaling and metabolism, carbohydrate metabolism, protein degradation, and abiotic stress responses. However, important differences were identified.

7. Evolution of genes involved in capsaicin synthesis

Accumulation of capsaicinoids, which mainly consist of capsaicin and dihydrocapsaicin, is exclusive to Capsicum and responsible for the fruits' pungency. Based on previous studies on pepper pungency, we identified 51 gene families involved in capsaicinoid biosynthesis in pepper and their orthologs in tomato, potato and Arabidopsis. Phylogenetic analysis showed that pepper had independent pepper-specific duplications in 13 gene families compared with the other three species (such as the ACLd, AT3, β-CT, C3H, CAD, CCR, Kas I and PAL gene families). The sequence divergence among gene duplicates could have led to diverged functions or neofunctionalization, promoting the evolution of specialized capsaicinoid biosynthesis. Taking AT3 as an example, we identified three tandem copies of the AT3 (Pun1) gene in pepper, which encodes a putative acyltransferase and acts as a regulator of pungency in certain Capsicum spp. Both AT3-D1 and AT3-D2 in wild and cultivated peppers have an amino acid substitution (K390R) in the conserved DFGWGKP motif. Analysis of AT3-D1 indicated that the pun1 allele (C locus) had a 2,724/2,930-bp deletion in non-pungent genotypes, spanning the putative promoter and the first exon, as reported previously.
We also identified short insertions/deletions and non-synonymous single-base substitutions in both AT3-D1 and AT3-D2 in pungent domesticated peppers when compared with Chiltepin.

Most gene families, except ACL-D4 and ACL-D5, exhibited tissue-/stage-specific expression accompanying the gradual accumulation of capsaicinoids. However, CCoAOMT-D9, AT3-D1 and AT3-D2 were significantly expressed only during the fruit developmental stages in which capsaicinoids were synthesized. We also carried out expression analysis of these expanded genes in five non-pungent peppers, which showed that the expression of AT3-D1 was either undetectable or present in trace amounts; this lack of expression may be caused by the large deletion in the pun1 allele, which made it a pseudogene in non-pungent peppers. More interestingly, expression of AT3-D2 probably accounts for the trace amounts of capsaicin and dihydrocapsaicin detected in non-pungent peppers. We conclude that a dosage compensation effect of AT3-D2 (Capang02g002091) and AT3-D1 (Capang02g002092) at the C locus shaped the diversification of pungency in peppers.
Keywords/Search Tags: de novo genome sequence, Capsicum, domestication, specialization
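
Illustrative note on the assembly statistics in section 1: the abstract defines N50 as the length such that 50% of the genome is in fragments of that length or longer, and quotes sequencing depth as a fold value. The Python sketch below shows one common way to compute both. The function names n50 and fold_depth and the toy contig lengths are hypothetical and not taken from the dissertation, and N50 is computed here against the total assembled length, which is an assumption about the convention used.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    together cover at least half of the total assembled bases
    (assumption: the total is the assembly length, not the estimated
    genome size).
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def fold_depth(read_bases_gb, genome_size_gb):
    """Average sequencing depth = total read bases / estimated genome size."""
    return read_bases_gb / genome_size_gb


# Toy contig lengths (hypothetical, not from the pepper assemblies).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Read yields (Gb) and K-mer genome size estimates (Gb) quoted in section 1.
zunla_depth = fold_depth(325, 3.26)
chiltepin_depth = fold_depth(205, 3.07)

print(toy_n50, round(zunla_depth, 1), round(chiltepin_depth, 1))
```

On the toy input the N50 is 70, because the two longest contigs (80 + 70) already reach half of the 300 total bases. The depth ratios come out at roughly 99.7 and 66.8, close to the 99-fold and 67-fold depths reported for Zunla-1 and Chiltepin.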