
Predicting Whole-Genome Protein-Protein Interactions In Sheep By Machine Learning Algorithms

Posted on: 2020-07-03; Degree: Master; Type: Thesis
Country: China; Candidate: S Duan; Full Text: PDF
GTID: 2393330578956470; Subject: Agriculture
Abstract/Summary:
Sheep are an important livestock species, and sheep farming is of great economic importance to the development of animal husbandry in China. However, many problems remain in sheep production and breeding, for example in nutrition and immunity, and the molecular mechanisms behind most of them are unclear. Studying protein-protein interactions (PPIs) is an important way to uncover protein function and is one of the key issues in molecular biology, yet no systematic prediction or study of PPIs in sheep has been reported. Against this background, this study focused on genome-wide PPIs in sheep, covering both their prediction and the construction of a related database. First, PPIs across the sheep genome were predicted by machine learning. Protein annotations from 20 databases were screened as feature data for model training. Six common machine learning methods were compared (random forest, decision tree, Bayesian classifier, logistic regression, support vector machine and neural network). The random forest classification model achieved an accuracy, precision and AUC of 0.8893, 0.9982 and 0.9533, respectively, better than the other classifiers, so random forest was chosen as the classification method for this study. For 28,592 sheep proteins, the study predicted 820,072 interacting protein pairs. The biological significance of the predictions was further supported by co-expression and direct interolog data. Second, based on the predicted genome-wide sheep PPI data, an internal test version of a sheep protein interaction database was built. The database mainly provides data upload and storage, network graph display and display of protein function information, offering a database resource for further research on the interactions and functions of sheep proteins. In summary, this study identified the best-performing machine learning method among those compared and provides a relatively complete data resource for in-depth mining of protein interaction mechanisms and for discovering specific biological functions. It may also provide a theoretical basis for explaining sheep breeding from a molecular point of view.
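The abstract compares classifiers by accuracy, precision and AUC. The following is a minimal Python sketch of how these three metrics are commonly defined for a binary "interacting / non-interacting" classifier. It is not the evaluation code used in the thesis; the function names, the 0.5 decision threshold and the toy labels and scores are illustrative assumptions.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of protein pairs whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of pairs predicted as interacting that truly interact."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive pair is scored
    above a random negative pair (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = interacting pair, 0 = non-interacting pair.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]  # classifier confidence per pair
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"accuracy  = {accuracy(y_true, y_pred):.4f}")
print(f"precision = {precision(y_true, y_pred):.4f}")
print(f"AUC       = {roc_auc(y_true, scores):.4f}")
```

Accuracy and precision depend on the chosen decision threshold, while AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the three are often reported together.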
Keywords/Search Tags:Sheep, Random forest, Prediction, Protein-protein interaction, Database