Identification Of Calcium-binding Residues In Proteins Based On Sequence Information

Posted on: 2016-08-11 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Jiang | Full Text: PDF
GTID: 2180330479496218 | Subject: Physical Electronics
Abstract/Summary:
To carry out their functions, proteins must bind specific ligands in many critical life processes, and the calcium ion is one such important ligand. Identifying Ca2+-binding residues is therefore an important step in protein function research and drug design. Binding residues can be detected accurately by experimental methods, but the process is time-consuming, so a rapid theoretical method for identifying ligand-binding residues in proteins is needed. In this thesis, Ca2+-binding residues are identified from sequence information. The main work is as follows:
(1) A new dataset of Ca2+-binding proteins was built. It comprises 277 calcium-binding protein chains with sequence identity below 30% and resolution below 3 Å, containing 1801 Ca2+-binding residues. Overlapping segments were generated with a sliding window. Based on the results computed for different window sizes, the best window size was determined to be 17.
(2) According to the biological characteristics of Ca2+-binding residues, Ca2+-binding segments and non-Ca2+-binding segments were analyzed statistically, and the physicochemical properties of the amino acid residues were studied. Ca2+-binding residues in set 1 were identified with three algorithms: the increment of diversity algorithm, the matrix scoring algorithm, and the support vector machine. The best result was achieved by the support vector machine using increment of diversity values, matrix scoring values, and AC values as features; under five-fold cross-validation the accuracy was 75.0% and the MCC was 0.5.
(3) Ca2+-binding residues were identified with a support vector machine based on multiple parameters, including the center motif, a new parameter introduced in this work. To study the effect of parameter combinations on identification performance, the feature parameters were added gradually as input vectors to the support vector machine. The method was further examined on three other Ca2+-binding protein sets (set 2, set 3, set 4) built by previous researchers; similar identification results and a similar increasing tendency were obtained, and all sets ultimately reached their best results. On set 3 and set 4, the method was compared with previous research under 10-fold cross-validation and obtained better identification results.
(4) A web server was built to identify the calcium-binding residues in a protein; it should be of some help to related research.
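The sliding-window step in (1) and the accuracy and MCC measures in (2) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The padding symbol `X` for positions beyond the chain ends, the rule that a segment is labeled by its central residue, and all function names are assumptions introduced here for illustration; the abstract does not state how chain termini were handled.

```python
from math import sqrt

PAD = "X"  # assumed padding symbol for positions beyond the chain termini


def sliding_windows(sequence, window=17, pad=PAD):
    """Return one overlapping segment of length `window` centred on each residue."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    half = window // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + window] for i in range(len(sequence))]


def label_segments(sequence, binding_positions, window=17):
    """Pair each segment with True if its central residue is Ca2+-binding.

    `binding_positions` holds 0-based residue indices (an assumed convention).
    """
    binding = set(binding_positions)
    segments = sliding_windows(sequence, window)
    return [(segment, i in binding) for i, segment in enumerate(segments)]


def accuracy_and_mcc(tp, tn, fp, fn):
    """Compute accuracy and the Matthews correlation coefficient from counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, mcc
```

With `window=17`, each segment carries eight residues of context on either side of the central residue. The labeled segments would then be converted to feature vectors (for example increment of diversity and matrix scoring values) before being passed to a classifier.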
Keywords/Search Tags: Calcium binding protein, Support vector machine, Increment of diversity algorithm, Matrix scoring algorithm, Auto-cross covariance