Development And Application Of Process Management Platform For Chloroplast Genome Assembly Analysis In Salicaceae

Posted on: 2021-02-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Y Wu
GTID: 1480306557984719
Subject: Forest genomics and bioinformatics
Abstract/Summary:
Organelle genomes are an important data source for comparative genomics, phylogenetics and population genetics. Compared with the nuclear genome, an organelle genome is smaller, single-copy, mostly maternally inherited and free of complications such as gene recombination, and its sequence is highly conserved. These properties give it unique advantages in studies of plant phylogeny, origin and evolution, and make it the most commonly used source of information for plant phylogenetic research. With the development of sequencing technology, a large number of plants have been sequenced. Whole-genome sequencing data contain organelle genome sequences, which can be used to assemble complete chloroplast genomes. In this dissertation, we designed an organelle genome assembly workflow and developed a user-friendly process management platform for chloroplast genome assembly and analysis. Taking S. wilsonii as an example, the assembly and analysis of a Salix chloroplast genome were completed on this platform. The chloroplast genomes of 17 Salix species were then assembled, compared and analyzed phylogenetically to demonstrate the platform in use. We further selected representative dicotyledons and discussed the use of chloroplast genome information for studying dicotyledon phylogeny. The results are as follows.

A new and effective assembly strategy was proposed, and the chloroplast genomes of two plants were successfully assembled. The reads library was constructed from third-generation sequencing data. A contig set was obtained by preliminary assembly, and the candidate genome was optimized with quickmerge. The seed contig was determined using reference conserved gene sequences. Based on an assembly feedback mechanism, the sequence can be extended repeatedly until the complete genome sequence is obtained. Finally, assembly integrity was checked against the genomes of related species, and the results were visualized.

The main functions of the platform include genome structure analysis, CDS structure analysis and codon preference analysis. Genome structure analysis covers genome size variation, GC content comparison and comparison of the number of protein-coding genes. CDS structure analysis extracts CDS sequences and analyzes coding region length variation and intron loss. Codon preference analysis mainly includes GC1s, GC2s, GC3s and GC content distribution maps, neutrality plot analysis and ENC-plot analysis. The platform provides two kinds of RSCU analysis, single-species RSCU analysis and inter-species comparative RSCU analysis, and presents the comparative results visually. The online data analysis tool mainly provides four functions: SSR analysis, codon preference analysis, mVISTA annotation file format conversion and collinearity mapping analysis.

Based on the platform, bioinformatics methods were used to complete the assembly, annotation and characterization of the chloroplast genome sequence of S. wilsonii, and the structure and composition of this genome were clarified. Comparative genomics and phylogenetic analyses were carried out using existing chloroplast genome data. The results showed that the chloroplast genome was 155,026 bp in size and encoded 130 genes. Gene composition and gene order were highly conserved. The infA gene might be a pseudogene. There are 18 introns in the chloroplast genome of S. wilsonii, and 3 genes contain 2 introns each. Codon preference analysis showed that the chloroplast genes of S. wilsonii preferred codons ending in A/T. The ENC values indicated weak codon bias in the chloroplast genome. Both the neutrality plot and the ENC-plot showed that the codon bias of the chloroplast genes of S. wilsonii was strongly affected by selection, gene length, base composition and mutation. SSRs were mainly mononucleotide repeats, and the genomic regions with high SSR density could serve as biomarkers. Collinearity analysis and whole-chloroplast-genome alignment showed that no large inversion or gene rearrangement had occurred. The phylogenetic trees indicated that S. wilsonii and S. tetrasperma are closely related.

Comparative analysis of Salix chloroplast genomes showed that they contain 115-134 genes. Some species have lost the lhbA gene, infA is a pseudogene in some plants, and ycf15 has been lost in most species. Comparing the chloroplast genome sequences of Salix, we found that the IR regions had the lowest variation and the LSC region the highest, that coding genes varied less than intergenic spacers, and that introns varied less than coding sequences. The nucleotide diversity (pi) values of the LSC and SSC regions were significantly higher than that of the IR regions, and these regions contributed significantly more parsimony-informative sites than the IR regions. Analysis of IR contraction and expansion showed that no gene spanned the IRa-LSC boundary in any Salix species. The IRb-LSC boundary of 13 species was located within the rpl22 gene, and the trnH gene was located downstream of the IRa-LSC boundary. The Ka/Ks ratio of the protein-coding gene sequences of Salix ranged from 0.8933 to 0.25. Codons preferentially ended in A and T, and SSRs were mainly composed of poly A/T. Most genes had low Ka and Ks values. The accD gene showed the strongest signal of positive selection, indicating that it is a rapidly evolving gene.

Phylogenetic analysis showed that S. interior, S. tetrasperma, S. chaenomeloides and S. paraplesia clustered into one branch. Phylogenetic analyses based on the SSC, the IRs, the coding regions and the noncoding regions all supported placing S. triandra on a separate branch. The phylogenetic relationships based on the NCR and on the whole chloroplast genome were consistent. Trees built from protein-coding genes grouped by function differed from one another, showing that gene function has some influence on the inferred phylogeny. The trees built from genes grouped by selection pressure showed that groups under different selection pressures, including negative selection, supported different evolutionary relationships. The trees built from genes grouped by nucleotide mutation rate showed that node support increased as the mutation rate increased, and that the evolutionary relationships inferred from different gene groups were inconsistent. Overall, gene function, the total nucleotide substitution rate and natural selection pressure all affected the phylogenetic analysis.
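The RSCU analysis named in the abstract can be illustrated with a short sketch. This is not the platform's code: the function name `rscu` and the input format (a list of in-frame CDS strings) are assumptions made for illustration. RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The codon-to-amino-acid assignments below are those of the standard genetic code, which the bacterial and plastid code (translation table 11) shares.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard codon assignments, bases ordered T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_sequences):
    """Return {codon: RSCU} for every amino acid observed in the input CDSs.

    Hypothetical helper: assumes each sequence is already in frame.
    """
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group sense codons into synonymous families.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent, RSCU undefined
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    for codon, value in sorted(rscu(["GCTGCTGCTGCA"]).items()):
        print(codon, round(value, 2))
```

A value above 1 means a codon is used more often than expected under equal usage. A preference for A/T-ending codons, as reported for S. wilsonii, would show up as values above 1 concentrated on codons whose third base is A or T.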
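The GC content by codon position used in the codon preference analysis can be sketched as follows. The function name `gc_by_codon_position` is hypothetical, and the sketch computes plain GC1, GC2 and GC3 over all codons. The GC1s/GC2s/GC3s values named in the abstract are usually restricted to particular codon subsets, and that filtering is not attempted here.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3): the GC fraction at each codon position.

    Hypothetical helper: assumes `cds` is in frame; a trailing partial
    codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    fractions = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base starting at this position
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


if __name__ == "__main__":
    gc1, gc2, gc3 = gc_by_codon_position("ATGGCCTAA")
    print(gc1, gc2, gc3)
```

In a neutrality plot, the mean of GC1 and GC2 for each gene is plotted against its GC3. A regression slope near 1 suggests that mutation pressure dominates, and a slope near 0 suggests that selection dominates.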
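The SSR analysis can be illustrated with a regular-expression scan for perfect tandem repeats. This is a minimal sketch, not the platform's method: the function name `find_ssrs` and the minimum repeat counts (10 for mononucleotide, 5 for dinucleotide and 4 for trinucleotide motifs) are illustrative assumptions, and only motifs of length 1 to 3 are handled.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, end, motif, repeat_count) tuples, 0-based, end-exclusive.

    Hypothetical helper; thresholds are illustrative, not the platform's settings.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 5, 3: 4}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if unit_len > 1 and len(set(motif)) == 1:
                continue  # e.g. "AA" repeated is a mononucleotide run, not a new SSR
            count = (match.end() - match.start()) // unit_len
            hits.append((match.start(), match.end(), motif, count))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CC"
    for hit in find_ssrs(demo):
        print(hit)
```

Counting hits by motif length gives the kind of summary reported in the abstract, where mononucleotide poly A/T repeats dominate.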
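The pi values compared across the LSC, SSC and IR regions are nucleotide diversity, the average proportion of differing sites between pairs of aligned sequences. The sketch below assumes the sequences are already aligned to equal length. The function name `nucleotide_diversity` is hypothetical, and sites with a gap or ambiguous base in either sequence of a pair are simply skipped, which is one common convention and not necessarily the one used in the dissertation.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Mean pairwise proportion of differing sites across aligned sequences."""
    if len(aligned) < 2:
        return 0.0
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    per_pair = []
    for a, b in combinations(aligned, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in "ACGT" and y in "ACGT":  # skip gaps and ambiguity codes
                compared += 1
                diffs += x != y
        if compared:
            per_pair.append(diffs / compared)
    return sum(per_pair) / len(per_pair) if per_pair else 0.0


if __name__ == "__main__":
    print(nucleotide_diversity(["ACGT", "ACGA", "ACGT"]))
```

Applying the function to a sliding window over a whole-genome alignment gives a per-region profile, from which regions such as the LSC and the IRs can be compared.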
Keywords/Search Tags: platform development, organelle genome, Salix wilsonii, chloroplast genome, evolutionary analysis