
Molecular Identification Of DNA Barcoding Of Three Medicinal Plants Based On Machine Learning Approaches

Posted on: 2022-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: T T Feng
GTID: 2504306554460244
Subject: Pharmacy
Abstract/Summary:
Gouteng (Uncaria), agarwood (Aquilaria) and maca (Lepidium meyenii) have high medicinal value, but counterfeit, adulterated and inferior products are common on the market, which may reduce drug efficacy and create medical safety risks. DNA barcoding is an efficient, convenient, rapid, accurate and reliable method for species identification and adulteration detection. Analysis of the barcode sequences is the most important part of DNA barcoding, and the choice of analytical method affects which barcode is judged suitable for species identification. In recent years, machine learning approaches have been applied to DNA barcoding species identification and compared with traditional analytical methods. Those comparisons showed that machine learning approaches outperformed traditional methods and can therefore serve as an effective tool for DNA barcoding species identification.

In this study, machine learning approaches (BLOG, SMO, Naïve Bayes, JRip and J48) were applied to species identification of three medicinal plants based on different DNA barcodes (ITS, ITS2, matK, psbA-trnH, rbcL and trnL-trnF). The results were compared with those of traditional distance-based (TaxonDNA) and tree-based (NJ tree) methods. The aims were to improve the identification success rates for Uncaria, Aquilaria, and maca and its adulterants, and to find better barcodes and analytical methods. The conclusions are as follows:

(1) DNA barcoding identification of Uncaria based on machine learning approaches: Six single barcodes (ITS, ITS2, matK, psbA-trnH, rbcL and trnL-trnF) and their combinations were used to analyze a total of 513 sequences from 15 Uncaria species with machine learning approaches (BLOG, SMO, Naïve Bayes, JRip and J48), TaxonDNA and NJ tree methods. The single barcodes ITS and ITS2 were proposed as suitable barcodes for the authentication of Uncaria species, as they provided a 100% success rate in identifying 11 Uncaria species. Machine learning approaches improved the efficiency of DNA barcoding for identifying Uncaria species: BLOG/SMO could identify 11 Uncaria species with the single barcode ITS and/or ITS2, while the NJ tree required the two-locus combination ITS + rbcL to reach the same species resolution rate, and TaxonDNA could identify only 10 Uncaria species.

(2) DNA barcoding identification of Aquilaria based on machine learning approaches: The identification efficiency of five single barcodes (ITS, matK, psbA-trnH, rbcL and trnL-trnF) and their combinations for 11 Aquilaria species listed in the IUCN Red List of Threatened Species was evaluated by comparing machine learning approaches with distance-based and tree-based methods. The results showed that the two-locus combinations ITS + matK, ITS + rbcL and ITS + trnL-trnF could accurately identify six Aquilaria species with the fewest barcodes. Machine learning approaches broadened the range of preferred barcodes for Aquilaria identification: ITS + matK, ITS + rbcL and ITS + trnL-trnF could all accurately identify the six Aquilaria species when BLOG/SMO was used, whereas only ITS + matK could identify these six species when TaxonDNA and the NJ tree were used. Furthermore, the machine learning approaches exhibited greater species resolution than the other analytical methods when dealing with more conserved sequences that have few variable sites (such as the chloroplast barcode psbA-trnH) and with chloroplast-locus combinations.

(3) DNA barcoding identification of Lepidium meyenii (maca) and its adulterants, with machine learning analysis: BLOG, TaxonDNA and NJ tree methods were used to evaluate the discrimination ability of five DNA barcodes (ITS, matK, psbA-trnH, rbcL and trnL-trnF) for maca and its two common adulterants, Brassica rapa (turnip) and Raphanus sativus (radish). The results showed that all five barcodes could accurately distinguish between maca, turnip and radish, regardless of the analytical method used. In addition, BLOG generated logic formulas that identified maca, turnip and radish accurately. Of the five barcodes, psbA-trnH and trnL-trnF differed greatly in sequence length between maca and turnip/radish because of a large number of indels, which made them candidate ideal barcodes for identifying maca and its adulterants. Turnip/radish was therefore mixed into maca samples in different proportions to test the accuracy, reliability and sensitivity of psbA-trnH and trnL-trnF. The results indicated that psbA-trnH and trnL-trnF could detect adulterants in maca without sequencing, as agarose gel electrophoresis showed differences in band position between species. In particular, trnL-trnF could detect turnip/radish adulteration at levels as low as 0.5%, whereas psbA-trnH could detect only radish adulteration, at the 7% level. Therefore, trnL-trnF was the ideal barcode for identifying maca and its adulterated products.

Combining machine learning approaches with DNA barcoding analysis of medicinal plants achieved accurate and rapid identification at the species level. The results of this study demonstrated that machine learning is a reliable and efficient tool for the DNA barcoding of medicinal plants, providing scientific and technical support for combating the adulteration of medicinal plants, protecting endangered species and ensuring medication safety.
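To make the distance-based baseline mentioned above more concrete, the following is a minimal Python sketch of a "best match" assignment, the general idea behind distance-based tools such as TaxonDNA. It is an illustration under stated assumptions, not the thesis's pipeline: the function names p_distance and best_match, the species labels and the 10-base aligned sequences are all hypothetical, and an uncorrected p-distance stands in for whatever distance model the study actually used.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def best_match(query, references):
    """Assign the query to the species of its closest reference sequence.

    references is a list of (species, aligned_sequence) tuples.
    Returns "ambiguous" when equally close references belong to
    different species.
    """
    scored = [(p_distance(query, seq), species) for species, seq in references]
    smallest = min(d for d, _ in scored)
    winners = {species for d, species in scored if d == smallest}
    return winners.pop() if len(winners) == 1 else "ambiguous"


# Hypothetical toy reference library (made-up sequences, not real barcodes).
REFERENCES = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACCTAC"),
    ("species_C", "TCGAACCTGC"),
]

if __name__ == "__main__":
    print(best_match("ACGTACGTAC", REFERENCES))
    print(best_match("ACGAACCTGC", REFERENCES))
```

In this toy library the second query is equally distant from one species_B and one species_C reference, so it is reported as ambiguous. That is the kind of case where a barcode with few variable sites limits a purely distance-based method.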
Keywords/Search Tags:DNA barcoding, machine learning, Uncaria, Aquilaria, Lepidium meyenii