
Transcriptome Sequencing In Garlic (Allium Sativum) And Functional Analysis Of The AsNF-YB3

Posted on: 2014-01-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X D Sun
GTID: 1263330425978513
Subject: Vegetable science
Abstract/Summary:
Garlic (Allium sativum L.) is one of the most widely used cultivated Allium species; its close relatives include onion, chives, leek and shallot. Garlic has been cultivated and used for more than 5,000 years. Garlic bulbs are used as a popular condiment, whereas the leaves and young inflorescences are consumed as green vegetables. Garlic is used not only as a spice or a food, but also for the treatment of many diseases. At harvest, the garlic bulbs are usually dormant. The length of dormancy depends on both the storage conditions and the genetic background of the cultivar. During post-harvest storage, dormancy gradually diminishes with the beginning of inner sprout growth. The sprouting of garlic bulbs during storage is a major factor limiting storage life, as it leads to a reduction of vegetable quality, loss of dry matter, and onset of disease. A number of studies have identified a set of differentially expressed genes, and several of these genes are specifically involved in the dormancy maintenance and sprouting of vegetative buds. However, the molecular mechanism of dormancy maintenance and sprouting in Allium crops is poorly studied and remains largely unknown.

The nuclear genomes of many Allium species are particularly large among eukaryotes. The nuclear genome of garlic is slightly smaller than that of onion and 32 and 6 times larger than those of rice and maize, respectively. Because of these large genomes, little genomic information is available for Allium species. As of April 2012, there were fewer than 300 Allium sativum nucleotide sequences at NCBI and only 4,752 ESTs in the garlic EST database.

Nuclear factor-Y (NF-Y) is a heterotrimeric transcription factor composed of three distinct subunits, NF-YA, NF-YB and NF-YC. The NF-Y protein has been found to bind specifically to CCAAT motifs in many promoter sequences. Each NF-Y subunit contains domains that are highly conserved among animals, yeast and plants, based on their molecular phylogeny.
Among the three NF-Y gene families in plants, NF-YB genes have been shown to play various roles, depending on the gene.

1. In this study, we applied Illumina sequencing technology to characterize the inner bud transcriptome of garlic. Approximately 26.67 million 90 bp paired-end clean reads were obtained from two libraries. All of the clean reads were assembled de novo using the SOAPdenovo program, producing 631,087 and 820,849 contigs from the two libraries, respectively. The reads were then mapped back to the contigs and, with the paired-end information, these contigs were joined into 129,724 and 142,984 scaffolds, respectively. Paired-end reads were used again to fill gaps in the scaffolds, yielding sequences that contain the fewest Ns and cannot be extended on either end. Such sequences are defined as unigenes. Removal of the partial overlapping sequences yielded 127,933 unigenes from the two libraries. Of these, 20,765 unigenes were ≥500 bp, and 2,793 were ≥1,000 bp.

2. All of the unigenes were compared with the sequences in public databases, including the NCBI non-redundant protein (Nr) database, the NCBI Clusters of Orthologous Groups (COGs) database, the Swiss-Prot protein database, and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, using the BLASTX algorithm with an E-value threshold of 10^-5. A total of 47,095 unigenes had significant hits (E-value < 10^-5) to the sequences in the above databases. Our results also showed that 69.95% of the unigenes over 500 bp in length had BLAST matches, whereas only 24.86% of the unigenes shorter than 300 bp did. Of all of the unigenes, 45,286 and 29,514 showed significant similarity to known genes in the Nr and Swiss-Prot databases, respectively.

3. The unigenes were aligned to the GO database. Putative functions were assigned to 10,840 unique sequences involved in the categories of biological process, cellular component and molecular function [Figure 1]. In the cellular component category, the two most common groups of genes were localized to plastids and mitochondria.
The identified genes covered various molecular function categories; the well-represented categories included transferase activity, nucleotide binding, hydrolase activity and protein binding. The sequences encoded a broad set of transcripts represented within the biological process category. Among these, the protein metabolic process, nucleobase-containing compound metabolic process, localization and biological regulation were well represented.

4. The unigenes were aligned to the COG database. A total of 21,952 sequences were assigned to COG classifications. Among the 25 COG categories, the cluster for general function prediction only (3,164; 14.41%) represented the largest group, followed by transcription (1,945; 8.86%), posttranslational modification, protein turnover and chaperones (1,888; 8.60%), replication, recombination and repair (1,877; 8.55%), translation, ribosomal structure and biogenesis (1,691; 7.70%), signal transduction mechanisms (1,285; 5.85%) and carbohydrate transport and metabolism (1,187; 5.41%).

5. A total of 20,706 unigenes demonstrated sequence similarities to the genes in the KEGG database. The largest group, the metabolic pathways, was well represented among the 10,242 Allium sativum unigenes. Pathways related to genetic information processing were the second largest group, with a majority of the proteins involved in transcription (1,742), translation (859), and folding, sorting and degradation (1,120). The third largest group comprised organismal systems, including those genes involved in environmental adaptation (1,391) and the immune system (67). Pathways related to cellular processes and environmental information processing were also well represented by the unigenes from Allium sativum.

6. We detected a significant change in the expression of 45,363 transcripts between the dormant and sprouting garlic bud libraries.
There was a more than 2-fold increase in the expression of 22,836 unigenes in sprouting garlic buds compared with those in dormant buds (up-regulated unigenes), and the expression of 22,526 unigenes was down-regulated.

7. Using RT-PCR, the cDNA of the AsNF-YB3 gene was isolated from garlic. AsNF-YB3 has an open reading frame of 633 bp and encodes a protein of 211 amino acids. The protein contains a typical HAP domain in the N-terminal region. The deduced amino acid sequence of AsNF-YB3 had a moderate degree of homology with those of other NF-YB proteins from various biological sources. AsNF-YB3 expression was detected in all tissues tested, including leaves, roots, flower buds, and flowers. Transgenic tobacco plants were developed by over-expression of the AsNF-YB3 gene via Agrobacterium-mediated transformation. The transgenic lines were confirmed by PCR amplification using AsNF-YB3-specific primers.

8. To evaluate the role of AsNF-YB3 in seed germination, seeds from WT and transgenic tobacco plants overexpressing AsNF-YB3 were grown in culture media. After 8 d of germination, the roots of the overexpression lines were significantly longer than those of wild-type plants. The number of leaves, plant height, shoot fresh weight and dry weight were measured at the 4th week after planting. The average height of the transgenic plants was significantly greater than that of the wild type during development, and the transgenic plants also had more leaves than the control plants. A reduction in time to flowering was also observed in the transgenic tobacco.
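
The 2-fold criterion in item 6, which separates up-regulated from down-regulated unigenes, can be illustrated with a minimal Python sketch. It is not the pipeline used in the dissertation: the function name `classify_fold_change`, the pseudocount, and the example expression values are all hypothetical, and the significance test behind "significant change" is not reproduced here.

```python
import math


def classify_fold_change(dormant, sprouting, threshold=2.0, pseudocount=1e-3):
    """Label a unigene by its sprouting/dormant expression ratio.

    `dormant` and `sprouting` are assumed to be normalized expression
    values (for example RPKM). The pseudocount is an assumption added
    only to avoid division by zero for unexpressed unigenes.
    """
    ratio = (sprouting + pseudocount) / (dormant + pseudocount)
    log2_ratio = math.log2(ratio)
    cutoff = math.log2(threshold)  # a 2-fold change equals 1 on the log2 scale
    if log2_ratio > cutoff:
        return "up"
    if log2_ratio < -cutoff:
        return "down"
    return "unchanged"


# Hypothetical values for illustration only: (dormant, sprouting)
example_unigenes = {
    "unigene_A": (5.0, 25.0),   # about 5-fold higher in sprouting buds
    "unigene_B": (40.0, 8.0),   # about 5-fold lower in sprouting buds
    "unigene_C": (10.0, 12.0),  # below the 2-fold cutoff
}

labels = {
    name: classify_fold_change(dormant, sprouting)
    for name, (dormant, sprouting) in example_unigenes.items()
}
print(labels)
```

Working on the log2 scale treats increases and decreases symmetrically, so a single cutoff covers both the up-regulated and the down-regulated classes.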
Keywords/Search Tags: Allium sativum, Illumina sequencing, Transcriptome, Transgenic tobacco, AsNF-YB3