Research On The Method Of Predicting The Ectodomain Shedding Events Of Membrane Proteins Based On Machine Learning

Posted on: 2021-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Xing
GTID: 2404330626458911
Subject: Computer technology
Abstract/Summary:
Membrane proteins play essential roles in modern medicine. In recent studies, some membrane proteins involved in ectodomain shedding events have been reported as potential drug targets and biomarkers for several serious diseases. However, there are still few effective tools for predicting the ectodomain shedding events of membrane proteins, so it is necessary to develop a tool that can predict them effectively.

In this thesis, we first use a traditional machine learning technique and propose a model based on a support vector machine (SVM) to predict the ectodomain shedding events of membrane proteins. To ensure a fair performance comparison, we use the dataset from the existing study and use different tools to calculate the attribute characteristics of each protein. Every protein is represented by 34 types of features, and the feature vector of each protein in the dataset consists of 1523 feature elements. We then apply two-stage feature selection to eliminate irrelevant and redundant feature elements and to keep the more informative ones. In the first stage, we use a modified minimum redundancy maximum relevance (mRMR) feature selection method to eliminate irrelevant and redundant feature elements. In the second stage, we use the support vector machine recursive feature elimination (SVM-RFE) feature selection method to rank the feature elements retained from the first stage. Finally, we select 127 feature elements and use them to train an SVM classifier. In experiments, the SVM-based model achieves an accuracy, sensitivity and specificity of 78.10%, 75.26% and 80.95%, compared to 71%, 75% and 67% for the existing model. In addition, on two independent positive test datasets, the sensitivities of the SVM-based model are 89.47% and 83.98%, compared to 73.68% and 65.80% for the existing model. These results show that the SVM-based model predicts the ectodomain shedding events of membrane proteins better than the existing model.

Next, we construct a deep learning model that uses a bidirectional long short-term memory network (Bi-LSTM) and an attention mechanism to predict the ectodomain shedding events of membrane proteins. First, for each protein sequence, we use the position-specific iterated basic local alignment search tool (PSI-BLAST) to perform an alignment search against the UniRef50 dataset and obtain a position-specific scoring matrix (PSSM) from the original sequence. Then, we use the Bi-LSTM, which contains memory cells, to capture long-distance relationships in the protein sequences, and use the attention mechanism to capture sorting signals in the protein sequences. During model training, we use Dropout, L2 regularization and the Bagging ensemble learning technique to reduce overfitting. In experiments, the deep learning model achieves an accuracy, sensitivity and specificity of 81.19%, 77.32% and 85.04%, which is better than both the existing model and the SVM-based model. In addition, on the test dataset, the accuracy, sensitivity and specificity of our proposed deep learning model are 83.14%, 84.08% and 81.63%, compared to 70.20%, 71.97% and 67.35% for the existing model. Therefore, we believe that our proposed deep learning model can be used to predict the ectodomain shedding events of membrane proteins more accurately.
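The three measures reported in the abstract (accuracy, sensitivity and specificity) follow the standard confusion-matrix definitions. The sketch below shows one way to compute them. It is not the thesis code: the function name `classification_metrics` and the toy labels are hypothetical, and the label convention (1 = shedding, 0 = non-shedding) is an assumption.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels.

    Assumed convention (not taken from the thesis): 1 = shedding, 0 = non-shedding.
    A measure is returned as None when its denominator is zero.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else None  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else None  # recall on negatives
    return accuracy, sensitivity, specificity


# Hypothetical toy labels, for illustration only (not data from the thesis).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
print(classification_metrics(y_true, y_pred))

# A test set that contains only positive samples has no negatives,
# so specificity is undefined and only sensitivity is informative.
print(classification_metrics([1, 1, 1, 1], [1, 1, 0, 1]))
```

The second call illustrates why only sensitivities are reported for the two independent positive test datasets: with no negative samples, specificity has no defined value.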
Keywords: Membrane Proteins, Ectodomain Shedding, Machine Learning, Deep Learning