
Research on the Commonality of Fungal sRNA Transboundary Regulation Mechanisms Based on Machine Learning

Posted on: 2022-07-31
Degree: Master
Type: Thesis
Country: China
Candidate: J X Chi
GTID: 2480306329490734
Subject: Software engineering
Abstract/Summary:
Plant fungal diseases are invasive and contagious, spreading through an infection process. They are numerous, accounting for about 80% of all plant diseases. Among the pathogens, Magnaporthe grisea, Phytophthora infestans and Botrytis cinerea cause serious damage to plants. Magnaporthe grisea and Phytophthora infestans have caused serious reductions in grain production, and Botrytis cinerea, which has a wide host range, is the primary factor limiting the production of greenhouse plants. In recent years, studies have found that pathogenic fungi and plants have transboundary regulatory mechanisms at the sRNA (small RNA) level. Existing studies have used biological experiments to examine the commonality of a few specific fungal sRNA sequences, but such experiments have limitations, such as the large number of sequences involved and contamination during operation. Meanwhile, bioinformatics is developing rapidly, and good progress has been made in data analysis and sRNA prediction models. Yet no researchers have used computational methods to investigate, comprehensively and in depth, whether pathogenic fungi share many similarities in their transboundary regulation of plants. Therefore, building a computational model to predict whether a fungal sRNA is a key pathogenicity sequence, and studying how key sRNA sequences regulate plants across borders and what they have in common, is of great significance for controlling fungal diseases, increasing grain production and income, and extending the storage life of fruits and vegetables.

This thesis is first based on six high-throughput non-coding datasets: Magnaporthe grisea sRNA; mixed sRNA 72 hours after Magnaporthe grisea infection of rice; Botrytis cinerea sRNA; mixed sRNA 72 hours after Botrytis cinerea infection of tomato; Phytophthora infestans sRNA; and mixed sRNA 72 hours after Phytophthora infestans infection of potato. Based on extensive statistical analysis of the data, sRNAs showing obvious differential expression after infection compared with before infection are regarded as key sRNAs for pathogenicity. Second, features are mined and extracted, and six machine learning algorithms (KNN, Naive Bayes, Decision Tree, Random Forest, SVM and XGBoost) are applied to the dataset to construct prediction models for key sRNAs in fungal transboundary regulation of plants. Comparing the models trained with their optimal parameters shows that all of them perform well. XGBoost achieves the highest AUC for all three fungi and also performs well in accuracy, recall, precision and F1 score. Its AUC values are 0.8642 for Magnaporthe grisea, 0.9404 for Botrytis cinerea and 0.9445 for Phytophthora infestans.

Then, functional enrichment analysis was performed on the core gene nodes in the predicted targets of the key sRNAs, yielding multiple GO (Gene Ontology) terms and KEGG Pathways. Finally, according to the commonality statistics, the Molecular Function (GO) intersection of rice and tomato is 3; the KEGG Pathway intersection of rice and potato is 16; the KEGG Pathway intersection of potato and rice is 11; the KEGG Pathway intersection of potato and tomato is 15; and 9 KEGG Pathways are shared by potato, rice and tomato.

The key fungal sRNA prediction model constructed in this thesis is the best model found by comparing multiple performance indicators across a variety of machine learning models. Because it fits all three fungi studied here, it can be used, to a certain extent, to predict the key sRNAs of fungi infecting plants and to support research on fungal infection of plants. This study also found that the enrichment results overlap substantially, which indicates that when fungal sRNAs regulate plants across borders, there are many commonalities in the plant functions and pathways they regulate. Further analysis of the shared KEGG Pathways showed that they participate in the regulation of plant gene expression and metabolism, thereby affecting plant growth, development, reproduction and responses to the external environment. This thesis lays a theoretical foundation and broadens the thinking for the study of fungal sRNA transboundary regulation of plants, and points to a new direction for the prevention and control of plant fungal diseases.
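
The commonality statistics reported in the abstract amount to intersecting the sets of enriched terms obtained for each host plant. The following Python sketch illustrates that counting step only. The function name `pathway_commonality` and the pathway names are hypothetical placeholders chosen for illustration; they are not the thesis's code, data or results.

```python
from itertools import combinations


def pathway_commonality(enriched):
    """Return pairwise and all-host intersections of enriched term sets.

    `enriched` maps a host plant name to the set of enriched terms
    (for example KEGG pathways) found for that host.
    """
    # Intersection for every unordered pair of hosts, keyed by sorted names.
    pairwise = {
        (a, b): enriched[a] & enriched[b]
        for a, b in combinations(sorted(enriched), 2)
    }
    # Terms enriched in every host at once.
    shared_by_all = set.intersection(*enriched.values())
    return pairwise, shared_by_all


# Placeholder input (assumption): invented names, not results from the thesis.
enriched = {
    "rice": {"pathway_A", "pathway_B", "pathway_C"},
    "tomato": {"pathway_A", "pathway_C", "pathway_D"},
    "potato": {"pathway_A", "pathway_B", "pathway_D"},
}

pairwise, shared = pathway_commonality(enriched)
for (a, b), common in pairwise.items():
    print(a, b, len(common))
print("all hosts:", sorted(shared))
```

Because the intersection of two sets does not depend on their order, each pair of hosts yields a single count under this scheme.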
Keywords/Search Tags:sRNA Data Mining, High-throughput Data, Differential Expression, Machine Learning, Cross-border Regulatory Commonality