Whole-genome Sequencing Of The Blister Beetle Mylabris aulica And Research On Genes Related To The Mechanism Of Cantharidin Synthesis

Posted on: 2021-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Hao
GTID: 2510306041453984
Subject: Aquatic biology
Abstract/Summary:
Cantharidin, produced by Meloidae as well as some species in Oedemeridae and Fulgoridae, has many medical uses (e.g., cancer treatment, curing bacterial infection and inflammation, and raising white blood cell levels) and is used in agriculture for pest control. As its range of applications has expanded, the demand for cantharidin has increased, and supply from the commonly used insect species can hardly meet this demand. Meanwhile, cantharidin is difficult to synthesize on an industrial scale. To understand the biological mechanism behind cantharidin synthesis in beetles and the metabolic pathways associated with it, this study selected Mylabris aulica, a blister beetle species widely distributed in Northern China, sequenced its whole genome, and studied its cantharidin synthesis-related genes. The main findings are as follows:

1. Sequencing, survey and assembly of the M. aulica genome. The next-generation sequencing platform Illumina HiSeq X Ten was used to obtain 13.5 Gb of genomic data, and the genome size was surveyed using the kmer_freq_hash program in GCE. K-mer analysis estimated the genome size of M. aulica to be 318.4 Mb. The genome was then sequenced on the Nanopore platform, yielding 53.5 Gb of raw data. The obtained data were quality controlled using Trimmomatic. After error correction, wtdbg was used for assembly and BUSCO for evaluation. The final genome assembly is 288.5 Mb in length, with an N50 of 467.8 kb, an L50 of 44, an average coverage of 101.4× (30.41 Gb of mappable data), and an error rate of 0.09%. Overall, the genome assembly has high continuity and a low error rate.

2. Annotation of the M. aulica genome. Repetitive elements in the genome were first searched for and classified. Then, using the genome with repeated sequences masked out, functional genes were annotated by both de novo prediction and homology prediction based on related species. Finally, the results of homology prediction and de novo prediction were integrated in the EVM software, yielding 16,500 protein-coding genes. The average gene length is 5,940.92 bp, and genes contain 6.62 exons on average. Subsequent functional prediction showed that 16,444 of these genes could be annotated in total: 16,131 genes were annotated in GenBank's NR database, 15,153 in the Pfam database, and 13,445 in the Swiss-Prot database. Among them, 15,559 genes were assigned to at least one GO process and 6,269 genes were assigned to two or more KEGG pathways. Analysis of the annotation results revealed that genes involved in cantharidin biosynthesis account for only a small part of the coding sequences. From the NR annotation results, the similarity of the genome sequences between M. aulica and two species that cannot produce cantharidin, Tribolium castaneum and Anoplophora glabripennis, is as high as 79.85%.

3. Comparative genomic analysis of M. aulica. To further explore the formation and evolution of genes related to cantharidin biosynthesis, this study identified 435 single-copy orthologous genes and used them for phylogenetic analysis. M. aulica, Hycleus phaleratus, and a third blister beetle formed a single clade on the tree with a very short divergence time. It is estimated that the three species diversified from a common ancestor within one million years, so much of their biochemical metabolism, including cantharidin biosynthesis, may still resemble that of the common ancestor. Further analysis of gene family contraction and expansion revealed 377 significantly expanded genes among 1,783 expanded gene families. Based on functional analysis, the mechanisms related to cantharidin biosynthesis were found to be conserved in these three blister beetles.

4. Analysis of genes related to cantharidin biosynthesis. To further analyse the cantharidin-related biosynthesis mechanism in the M. aulica genome, this study used the KAAS tool to annotate genes against the KEGG database and to find the terpenoid biosynthetic pathways involved in cantharidin biosynthesis. In addition to the genes already reported to be related to cantharidin biosynthesis, two new gene sequences were identified. Functional search analysis found that they encode enzymes involved in the biosynthesis of acetoacetyl-CoA and farnesal, respectively, and Pfam analysis shows that they may function as a thiolase and a short-chain dehydrogenase, respectively.
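The N50 and L50 statistics quoted for the assembly can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the pipeline used in the thesis: the function name `n50_l50` and the toy contig lengths are hypothetical, and the real contig lengths of the M. aulica assembly are not given in the abstract. N50 is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length; L50 is the number of contigs needed to reach that point.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")

    # Walk contigs from longest to shortest, accumulating length.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        # The first contig that brings the running sum to half of the
        # assembly defines N50 (its length) and L50 (its rank).
        if running >= half_total:
            return length, count


# Toy contig lengths (made up for illustration, not from the assembly).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

Read this way, the reported values mean that the 44 longest contigs together cover at least half of the 288.5 Mb assembly, and the shortest of those 44 is 467.8 kb long.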
Keywords/Search Tags: Mylabris aulica, Cantharidin, Genome, Comparative genome, Phylogenetic analysis