Predicting S-sulfenylation Sites Using Machine Learning

Posted on:2019-02-04

Degree:Master

Type:Thesis

Country:China

Candidate:G C Lei

GTID:2370330593951051

Subject:Computer technology

Abstract/Summary:

This thesis focuses on computational methods for predicting protein modification sites and reports our work in this area. In the past few years, considerable effort has gone into predicting post-translational modifications (PTMs) in proteins. By altering the side chains of peptides, PTMs play an important role in many biological processes. They can affect protein subcellular localization, functional activation, turnover, and interaction with other molecules. They are also associated with many complex diseases, such as Parkinson's disease, Alzheimer's disease, and some cancer-related diseases. Studying the post-translational modification of proteins is therefore of great significance. In this thesis we study and predict S-sulfenylation sites, building on the work of earlier researchers. Because this modification is difficult to study and the available samples are few, prediction accuracy has not been very high. We address this in the following two parts to obtain satisfactory accuracy.

(1) We first analyze the feature extraction methods that earlier work has used to represent protein sequences, especially short text sequences. These include features derived from physicochemical properties, binary encoding, PSSM profiles, and position-specific encoding of amino acids in the sequence. The resulting feature representations are fed into machine learning models, including random forest, support vector machine (SVM), logistic regression, and other algorithms, to test and predict protein modification sites.

(2) To address the shortcomings in method and accuracy noted above, we make the following improvements. First, because only a limited number of protein samples can be obtained by experiment, it is difficult to extract informative signal from short text sequences, and after feature representation short sequences produce more repeated features than long sequences. We therefore improve accuracy by selecting the number of samples used. Second, for feature representation, we represent each short text sequence by the physicochemical properties of its amino acids and apply a feature selection method to improve prediction accuracy.
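The encodings named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the window size (`flank`), the padding symbol `X`, the helper names, and the choice of the Kyte-Doolittle hydropathy scale as the physicochemical property are all assumptions made for illustration. The sketch cuts a fixed-length window around each cysteine, the residue that S-sulfenylation modifies, and encodes it either as a binary vector or as a vector of property values.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # assumed placeholder for positions outside the sequence

# Kyte-Doolittle hydropathy index, used here as one example property
# (assumption: the thesis does not say which properties it uses).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def cysteine_windows(sequence, flank=3):
    """Return (position, window) pairs, one per cysteine in the sequence.

    Each window holds `flank` residues on either side of the cysteine;
    the sequence ends are padded with PAD so all windows have equal length.
    """
    padded = PAD * flank + sequence + PAD * flank
    return [
        (i, padded[i:i + 2 * flank + 1])
        for i, residue in enumerate(sequence)
        if residue == "C"
    ]


def binary_encode(window):
    """Binary (one-hot) encoding: 20 bits per position, all zero for PAD."""
    features = []
    for residue in window:
        features.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return features


def hydropathy_encode(window):
    """One property value per position; PAD and unknown residues map to 0.0."""
    return [HYDROPATHY.get(residue, 0.0) for residue in window]


# Hypothetical toy sequence, not taken from any dataset.
windows = cysteine_windows("MKCAALCG", flank=2)
binary_features = [binary_encode(w) for _, w in windows]
property_features = [hydropathy_encode(w) for _, w in windows]
```

Each feature vector could then be passed to a classifier such as a random forest or an SVM. The binary encoding grows by 20 columns per window position, whereas the property encoding adds one column per position per property, which is one reason a compact physicochemical representation combined with feature selection may suit small sample sizes.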

Keywords/Search Tags:

S-sulfenylation Sites, Physicochemical Properties Difference, Machine Learning, Short Text Sequence

Related items

1	Research On Classification Of Protein Post-translational Modification Sites Based On Machine Learning In Imbalanced Data Set
2	Research On RNA Related Function Sites Based On Machine Learning
3	Protein And RNA Modification Sites Prediction By Using Machine Learning Method And Its Application
4	Research On The Protein Modification Sites Based On Machine Learning
5	Predicting Functional Sites Based On Support Vector Machine And Extreme Learning Machine
6	Predicting Carbonylation Sites Based On Machine Learning Methods
7	Research On Protein-ligand Binding Sites Prediction Based On Sequence Information
8	Prediction Of Protein Phosphorylation And S-sulfenylation Sites Based On CapsNet And ACNet
9	Research On Protein-protein Binding Sites Prediction Method Based On Sequence Information
10	The Prediction Of Human DNase I Hypersensitive Sites By Using DNA Sequence Information