Cross-species Analysis Of Gene Expression Profiling For Cold And Hot Propertied Chinese Herbal Medicines

Posted on: 2020-10-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: A R Li
GTID: 1480306038471454
Subject: Pharmacy
Abstract/Summary:
Objective
As the foundation of the theoretical system of Chinese herbal medicine (HM), property theory is the most important academic characteristic of HMs. It not only guides the rational application of traditional Chinese medicine but is also the core feature that distinguishes HMs from botanical and natural drugs. At present, the property of HMs is an indispensable part of Chinese medicine research. The "Four Characters" are the core content of the theoretical system of HMs, and the cold/hot properties of HMs are the basis of the "Four Characters". Using modern scientific methods and technologies to study the cold/hot properties of HMs, we can uncover their deeper significance, connotation and essence. With the advent of high-throughput RNA-Seq, there has been a concerted effort to generate whole transcriptomes of plant species. However, to date, no study has examined the cold or hot properties of HMs from the perspective of plant gene expression profiles based on RNA-Seq. In this paper, a method for cross-species comparison of plants is used to study the cold or hot properties of HMs, which provides a new perspective. We used 20 publicly available plant RNA-Seq datasets (HMs with cold and hot properties) aggregated from 20 papers to identify the differences between cold and hot HMs. We created a method for cross-species gene expression analysis to compare RNA-Seq data across 20 different species and mined the differentially expressed genes. Finally, functional enrichment analysis with GO was performed to further identify functional differences between hot-enriched and cold-enriched genes.

Methods
(1) Selection of HMs with cold and hot properties
We searched the databases MEDLINE (PubMed), Embase and Google Scholar for articles published from inception up to 1 January 2018. The findings of two search-term groups were combined: the terms "traditional Chinese medicine", "Chinese herbal medicine", "Chinese herb" and "herbal medicine" were used for the first group; "transcriptome", "RNA-Seq" and "high throughput sequencing" were used for the second group. After exclusion based on title and abstract, we identified 237 articles related to herbal medicine and transcriptome. The associated information, such as the TCM properties of each HM, was collected and listed in a basic database.

(2) RNA-Seq data acquisition and de novo assembly
The publicly available RNA-Seq datasets were downloaded according to the SRR IDs recorded in the 20 selected articles. All reads were then processed through a quality-check and trimming pipeline using FastQC and Trimmomatic, respectively, to remove residual adapters, low-quality sequences, and reads shorter than 36 bp. The remaining high-quality reads were de novo assembled into candidate unigenes using the Trinity program. For the remaining analyses, we defined expressed genes as those with more than one TPM (transcripts per million), and we describe the expression features of the transcripts in the following sections.

(3) Cross-species gene expression analysis
In BUSCO, genes classified as 'Complete and single-copy' in a species are those matched to hidden Markov model (HMM) profiles built from amino acid alignments (31 plants, including Arabidopsis thaliana). The criteria used to classify a match as orthologous or not, and as complete or not, in each species are based on plant phylogenetic balance, so we adopted them as a criterion in our study. Specifically, we used BUSCO to select genes that were complete and single-copy in all species. The assembled sequences of each species were then mapped to the reference genome (Arabidopsis thaliana) using the BLASTP program without additional parameter settings. The maximum value of each sample's '-log(e-value)' (minimum standard) in BLASTP for the 'complete, single-copy in all species' genes obtained from BUSCO was used as the mapping parameter in subsequent analysis.

(4) Enrichment analysis
Analysis of the resulting gene list was performed using DAVID. Functional enrichment analysis with GO was performed to identify which specific genes were significantly enriched in GO terms and metabolic pathways. We compared these specific genes with the whole-genome (Arabidopsis thaliana) background.

Results
(1) Selection of HMs with cold and hot properties
In total, 113 HMs were screened, including 32 with hot properties, 58 with cold properties and 23 with neutral properties. The majority (81%) of the collected articles used the Illumina sequencing platform, and we retained only Illumina datasets for further study. To achieve an even tissue distribution between the two categories of HMs, 10 cold and 10 hot HMs were enrolled.

(2) De novo transcriptome assembly
After the removal of adaptor sequences, ambiguous reads, and low-quality reads, the Q30 percentage of each sample exceeded 86.43%, and the GC content was 42-48%. We combined all the clean reads and assembled them de novo using Trinity. The length distributions of contigs and unigenes indicated that the assembly results were favorable and suitable for subsequent studies.

(3) Functional enrichment analysis and comparison between HMs among tissues
The match results for the 20 species showed no significant difference. Next, the 20 samples were divided into four groups based on tissue, and we explored genes that were expressed in only one tissue. Leaves had the highest percentage of such tissue-specific genes despite having the smallest transcriptome. We performed GO enrichment analyses on each set of tissue-specific genes and found enriched GO terms in several tissues.

(4) Functional enrichment analysis and comparison between HMs with cold and hot properties
According to the phylogenetic tree, five of the HMs with hot properties are dicotyledons and five are non-dicotyledons. To rule out the possibility that differences in specific genes arise from the evolutionary distance between species, we first analyzed the specific genes between dicotyledons and non-dicotyledons. The results showed no significant difference between the two groups (mapping rates: 17.5% for dicotyledons and 16.0% for non-dicotyledons). Next, we performed enrichment analysis and comparison between HMs with cold and hot properties in the Gene Ontology (GO) database using the DAVID tool. Based on the number of genes assigned, carbohydrate metabolic process, polysaccharide metabolic process, plant-type cell wall, carboxylic ester hydrolase activity and carbohydrate catabolic process were the dominant terms in cold HMs; for HMs with hot properties, protein folding, response to oxidative stress, sulfur compound metabolic process, protein disulfide oxidoreductase activity and disulfide oxidoreductase activity were overrepresented.

Conclusion
Here, we focused on distinguishing the raw materials of HMs with cold and hot properties using a bioinformatics approach. Using 20 publicly available plant RNA-Seq datasets (HMs with cold and hot properties) aggregated from 20 papers, we created a method for cross-species gene expression analysis to compare RNA-Seq data across 20 different species and mined the differentially expressed genes. Our analyses reveal that carbohydrate metabolic process, polysaccharide metabolic process, plant-type cell wall, carboxylic ester hydrolase activity and carbohydrate catabolic process are the dominant terms in cold HMs, whereas for HMs with hot properties, protein folding, response to oxidative stress, sulfur compound metabolic process, protein disulfide oxidoreductase activity and disulfide oxidoreductase activity were overrepresented. Our research gives new insight into HM properties from the standpoint of the transcriptome and provides a new perspective for cross-species comparison among distantly related plant species.
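
The expression filter in the Methods (genes counted as expressed when above one TPM) can be illustrated with a minimal Python sketch. The abstract does not state which quantification tool produced the TPM values, so the sketch assumes per-unigene read counts and lengths are already available. The function names, unigene identifiers and numbers below are hypothetical and for illustration only.

```python
def tpm(counts, lengths_bp):
    """Convert read counts to TPM (transcripts per million).

    counts and lengths_bp are dicts keyed by unigene ID (hypothetical IDs here).
    Each count is first divided by the transcript length, then the
    length-normalized rates are rescaled so that they sum to one million.
    """
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


def expressed_genes(tpm_values, threshold=1.0):
    """Return the sorted IDs of genes strictly above the TPM threshold."""
    return sorted(gene for gene, value in tpm_values.items() if value > threshold)


# Illustrative toy data, not taken from the dissertation.
counts = {"unigene_1": 2_000_000, "unigene_2": 1, "unigene_3": 50}
lengths = {"unigene_1": 1000, "unigene_2": 1000, "unigene_3": 500}

values = tpm(counts, lengths)
print(expressed_genes(values))
```

In this toy example unigene_2 falls below one TPM and is dropped, while the other two unigenes are kept. Because TPM values sum to one million within each sample, the same threshold can be applied to every species before any cross-species comparison.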
Keywords/Search Tags: traditional Chinese medicine, property, transcriptome, RNA-Seq, cross-species