Survey of Protein-DNA Interaction of Aspergillus oryzae on a Genomic Scale

Posted on: 2016-03-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Wang
GTID: 1220330479495095
Subject: Fermentation engineering
Abstract/Summary:
The genome-scale delineation of in vivo protein-DNA interactions is key to understanding genome function. Only approximately 5% of transcription factors (TFs) in the Aspergillus genus have been identified using traditional methods. Although the Aspergillus oryzae genome contains more than 600 TFs, knowledge of the in vivo genome-wide TF-binding sites (TFBSs) in aspergilli remains limited because of the lack of high-quality antibodies. We investigated the landscape of in vivo protein-DNA interactions across the A. oryzae genome by coupling DNase I digestion of intact nuclei with massively parallel sequencing and by analysing the cleavage patterns of protein-DNA interactions at single-nucleotide resolution. The detailed results of this thesis are listed below.

1) Genome-scale identification of DNase I footprints from the DNase-seq library
A. oryzae nuclei were isolated and treated with DNase I to release DNA fragments of < 500 bp. RT-qPCR was used to verify the sensitivity of the DNase I library. DNase I digestion of A. oryzae nuclei coupled with massively parallel sequencing was used to create a whole-genome DNase I cleavage library under two conditions: nutrient-rich culture (DPY condition) and ER-stress induction (UPR condition). DNase I cleavage sites mapped by uniquely mapped reads were confined to 30.71 and 27.10 million unique positions within the A. oryzae genome. Using a computational algorithm, 8125 footprints from cells grown under the DPY condition and 8894 footprints from cells grown under the UPR induction condition were identified in the intergenic regions of the A. oryzae genome at an FDR threshold of 0.1.

2) De novo identification of motif sequences through genomic DNase I footprinting
The 8125 footprints identified under the DPY condition and the 8894 footprints identified under the UPR induction condition (FDR threshold of 0.1) were searched for de novo sequence motifs using MEME. MEME recovered 12 overrepresented motifs from the DPY footprint set and 11 overrepresented motifs from the UPR footprint set, corresponding to known Aspergillus TFs, including SltA, CpcA and the E-box of bHLH factors. The mean per-nucleotide DNase I cleavage rate across each motif was computed in the footprint regions. Based on the de novo motifs recovered from genomic footprints, we observed that the SltA TF-binding motifs showed two co-localization binding patterns, each including at least two SltA binding sites, that regulate genes with different functions.

3) Transcriptome analysis of A. oryzae under the nutrient-rich culture condition
Strand-specific RNA-seq data generated under the DPY condition were used to determine genome-wide transcription levels. A total of 26,287,168 paired-end reads of 90 bp were mapped to the A. oryzae RIB40 genome and genes, and gene expression in the A. oryzae RIB40 genome was calculated. Based on the strand-specific RNA-seq data, specific transcription start sites (TSSs) were assigned for 5050 genes. Compared with previously determined transcriptional data from the basal nutrient culture condition, the genes upregulated in this study were enriched in biological catabolic pathways.

4) DNase I cleavage patterns and the distribution of digital footprints near TSSs
DNase I cleavage information for the -1 kb to +1 kb regions flanking the TSSs of 5050 A. oryzae genes was extracted from the DNase-seq data. The average DNase I cleavage pattern of the 5050 genes was organized around the TSSs in the sequential arrangement of the -1 nucleosome, the 5' nucleosome-free region, the TSS and the +1 nucleosome. Furthermore, K-means clustering was performed on the DNase I cleavage patterns of the 5050 genes in the +/- 1 kb TSS flanking regions. The results revealed that the 5050 A. oryzae genes could be divided into four distinct clusters, which differed in gene expression level and in 5' UTR length. The number of overrepresented footprints in the regions from the translation start codon to 500 bp upstream of the TSSs was also computed.

5) Genome-scale identification and structural features of active binding sites for the known TFs
We gathered the available binding motif sequences of 19 known Aspergillus TFs and the available DNA–protein co-crystal structures of TF homologues from five families in Aspergillus and S. cerevisiae. The DNase I cleavage patterns of the orientation-specific motifs for the 19 known Aspergillus TFs, derived by mapping DNase-seq tags to the plus and minus strands, showed an imbalance between the sense and antisense strands both within and outside the binding-motif sequences. The DNase I cleavage patterns at individual nucleotide positions were aligned onto the DNA backbones of the co-crystal models. The contact patterns of the charged amino acids of the monomer TF and the complex TF were imbalanced between the plus and minus motif sequences.
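
The per-nucleotide cleavage averaging described in sections 2) and 5) can be pictured with a small sketch. It is not the pipeline used in the thesis: the cut-count dictionary, the 0-based half-open site coordinates, and the function name mean_cleavage_profile are assumptions made purely for illustration. Each motif instance is read in its own 5'-to-3' orientation, and cuts on the motif strand (sense) are averaged separately from cuts on the opposite strand (antisense).

```python
from typing import Dict, List, Tuple

# Hypothetical input format: number of DNase I cuts at each (strand, position).
Cuts = Dict[Tuple[str, int], int]
# Hypothetical motif instance: (start, end, strand), 0-based, end exclusive.
Site = Tuple[int, int, str]


def mean_cleavage_profile(cuts: Cuts, sites: List[Site],
                          flank: int = 0) -> Dict[str, List[float]]:
    """Average cut counts per motif position, split into sense and antisense."""
    if not sites:
        raise ValueError("no motif instances given")
    width = sites[0][1] - sites[0][0]
    length = width + 2 * flank
    totals = {"sense": [0.0] * length, "antisense": [0.0] * length}
    for start, end, strand in sites:
        if end - start != width:
            raise ValueError("all motif instances must have the same width")
        positions = list(range(start - flank, end + flank))
        if strand == "-":
            # Read minus-strand instances 5'->3' so that motif columns line up.
            positions.reverse()
        other = "-" if strand == "+" else "+"
        for i, pos in enumerate(positions):
            totals["sense"][i] += cuts.get((strand, pos), 0)
            totals["antisense"][i] += cuts.get((other, pos), 0)
    n = len(sites)
    return {name: [v / n for v in vals] for name, vals in totals.items()}


# Toy data: two instances of a 3-bp motif, one on each strand.
toy_cuts: Cuts = {
    ("+", 10): 4, ("+", 12): 2, ("-", 10): 1, ("-", 12): 3,
    ("-", 22): 2, ("+", 20): 2,
}
toy_sites: List[Site] = [(10, 13, "+"), (20, 23, "-")]
profile = mean_cleavage_profile(toy_cuts, toy_sites)
print(profile["sense"])
print(profile["antisense"])
```

A difference between the two averaged profiles at the same motif column is the kind of sense/antisense imbalance the abstract refers to. Real data would also need normalisation for sequencing depth and DNase I sequence bias, which this sketch leaves out.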
Keywords/Search Tags:Aspergillus oryzae, Protein-DNA interaction, Transcription factor, Transcription regulation, DNA motif, Chromatin structure