Prediction Of Protein-protein Interactions Based On Wavelet Transform And Ensemble Learning

Posted on: 2018-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: J Luo
GTID: 2310330536961911
Subject: Chemical engineering
Abstract/Summary:
Proteins play a critical role in the activities of living organisms. Most physiological functions in cells are carried out by proteins and by combinations of proteins, and protein function is realized mainly through the binding of proteins to one another. Traditional experimental methods for studying protein binding have produced a large amount of information on proteins and their interactions, but most of these methods have limitations; in particular, their testing speed cannot meet the requirements of further studies. In recent years, researchers have combined machine learning tools with protein feature-encoding algorithms to predict protein interaction networks, and have proposed methods to improve prediction accuracy. However, our experiments show that most of these prediction methods fail to achieve ideal results on more stringent data sets. In this work, protein interaction data for Helicobacter pylori, yeast and Arabidopsis thaliana were used to study the prediction of protein interactions within a species and the transfer of prediction between species, using a variety of machine learning methods in combination with the wavelet transform. The main contents are as follows:

In the first set of experiments, positive and negative protein interaction data sets for Helicobacter pylori, yeast and Arabidopsis were first obtained from the DIP protein database, and the protein primary structure was then encoded with the wavelet transform method. An ensemble combining a gradient boosting classifier, an extra trees classifier, a k-neighbors classifier, quadratic discriminant analysis and logistic regression was used to predict the protein interactions of Helicobacter pylori and yeast. Finally, the prediction results were analyzed. The experimental results show that the new ensemble algorithm performs well and is stable across different data sets, making it an algorithm worth further development.

In the second set of experiments, we compared the new algorithm with the TrAdaBoost algorithm, which is widely accepted in the academic community, on a self-built protein transfer-learning data set. In these tests, the new algorithm also showed better predictive ability. However, when comparing the predictive capability of multiple algorithms, we found that the rules governing protein interactions differ between species. Therefore, current sequence-based protein interaction prediction cannot be applied when the training data sets and the prediction data sets are assembled from different species.
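The wavelet-based encoding of the protein primary structure can be illustrated with a minimal sketch. The abstract does not state which wavelet, which amino-acid property or which summary statistics were used, so the Haar wavelet, the Kyte-Doolittle hydropathy scale, the three decomposition levels and the names `haar_step`, `dwt_features` and `encode_pair` are all assumptions made for illustration only.

```python
import math
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy index. Using this property is an assumption;
# the abstract does not say which physicochemical scale was used.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        # Pad odd-length signals by repeating the last value.
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def summarize(coeffs):
    """Fixed-size summary of a coefficient band (hypothetical choice)."""
    return [max(coeffs), min(coeffs), mean(coeffs), pstdev(coeffs)]


def dwt_features(sequence, levels=3):
    """Map a protein sequence of any length to a fixed-length vector."""
    # Residues outside the 20 standard amino acids are skipped.
    signal = [HYDROPATHY[aa] for aa in sequence.upper() if aa in HYDROPATHY]
    if not signal:
        raise ValueError("sequence contains no standard amino acids")
    features = []
    approx = signal
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features += summarize(detail)   # detail band of this level
    features += summarize(approx)       # final approximation band
    return features


def encode_pair(seq_a, seq_b, levels=3):
    """Concatenate the vectors of two proteins to describe one candidate pair."""
    return dwt_features(seq_a, levels) + dwt_features(seq_b, levels)
```

Because each band is reduced to four statistics, proteins of different lengths yield vectors of the same size (16 values per protein with three levels), which is what a downstream classifier requires.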
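The ensemble step can be sketched with scikit-learn's `StackingClassifier`. The abstract lists the five algorithms and the keywords name stacked generalization, but neither says how the learners were arranged. Using logistic regression as the combiner over the other four, the 5-fold setting, the hyperparameters and the synthetic data below are therefore assumptions, not the thesis's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def build_stacked_model(random_state=0):
    """Stacked generalization over the classifiers named in the abstract.

    The arrangement (four base learners, logistic regression on top)
    is an assumed layout for illustration.
    """
    base_learners = [
        ("gb", GradientBoostingClassifier(random_state=random_state)),
        ("et", ExtraTreesClassifier(n_estimators=100, random_state=random_state)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("qda", QuadraticDiscriminantAnalysis()),
    ]
    # Base-learner predictions are produced out-of-fold (cv=5) and then
    # passed to the logistic regression combiner.
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


# Synthetic stand-in for wavelet-encoded protein pairs (not real PPI data).
X, y = make_classification(
    n_samples=400, n_features=32, n_informative=10,
    n_redundant=0, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_stacked_model()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

In a cross-species setting such as the one described in the second set of experiments, `X_train` and `X_test` would come from different organisms rather than from a random split of one data set.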
Keywords/Search Tags: Discrete Wavelet Transform, Stacked Generalization, Transfer Learning, Protein-Protein Interaction