
Identification Of Ion Ligand Binding Residues Based On Optimized Feature

Posted on: 2021-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: L Liu
Full Text: PDF
GTID: 2370330614960645
Subject: Mathematics
Abstract/Summary:
Proteins are essential substances in life activities and play an irreplaceable role in many life processes. Since most proteins need to bind ion ligands to perform their functions, accurate prediction of protein-ion ligand binding residues is particularly important for annotating protein function. Experimental methods are time-consuming, laborious and expensive, so identifying ion ligand-binding residues by theoretical methods has become a direction of development in mathematical biology. Since only amino acid sequence information is available for most proteins, sequence-based prediction methods are the more generally applicable way to recognize the binding residues of ion ligands. The main work in this thesis is as follows:

(1) We constructed a dataset of binding residues for 4 acid radical ion ligands: NO₂⁻, CO₃²⁻, SO₄²⁻ and PO₄³⁻. Amino acid, polarization charge, hydrophilic-hydrophobic property, relative solvent accessibility and secondary structure were selected as the basic characteristic parameters. At the optimal window length, we extracted three kinds of features from these basic parameters: (a) the composition feature; (b) the 2L-dimensional position conservation feature; (c) a 20-dimensional refinement feature (10-dimensional discrete increment values and 10-dimensional matrix scoring values). We input them into a k-nearest neighbors classifier and used the optimal k value to identify the binding residues of the 4 acid radical ion ligands. In 5-fold cross-validation, the 20-dimensional refinement feature gave the best prediction results. To test the practicality of the optimal prediction model, we performed an independent test and obtained good prediction results. We then applied the optimal prediction model to a dataset of binding residues for 10 metal ion ligands and again obtained good prediction results, which further supported the practicality of the model.

(2) The binding residues of the above 14 ion ligands were selected as the research objects. The dihedral angle was introduced as a basic characteristic parameter. The phi/psi angles of the amino acids on the protein chains were statistically analyzed and classified according to the difference in their distribution between binding and non-binding residues, and their composition and 2L-dimensional position conservation features were then extracted. In addition, we proposed a new method, information entropy, for extracting information on polarization charge and the hydrophilic-hydrophobic property. The composition and 2L-dimensional position conservation information of amino acids, secondary structure, relative solvent accessibility and dihedral angle, together with the information entropy of polarization charge and the hydrophilic-hydrophobic property, were extracted as the optimized feature. We input this feature into the Random Forest algorithm to identify the binding residues of the 14 ion ligands and obtained better prediction results than previous works in both 5-fold cross-validation and the independent test. Combining the classifications of the phi angle and the psi angle further optimized the prediction model for the binding residues of each ion ligand.
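The window-based feature extraction described in the abstract can be illustrated with a small sketch. The abstract does not give the thesis's exact definitions, so everything here is an assumption made for illustration: the helper names `windows` and `class_entropy`, the padding character `X`, the three-class hydropathy grouping in `HYDRO_CLASS`, and the use of plain Shannon entropy over class labels inside a window.

```python
import math
from collections import Counter

# Placeholder hydropathy grouping for illustration only; it is NOT taken
# from the thesis, whose classification is not listed in the abstract.
HYDRO_CLASS = {}
for aa in "AVLIMFWC":
    HYDRO_CLASS[aa] = "hydrophobic"
for aa in "GSTPYH":
    HYDRO_CLASS[aa] = "neutral"
for aa in "DENQKR":
    HYDRO_CLASS[aa] = "hydrophilic"


def windows(sequence, length, pad="X"):
    """Return one segment of `length` residues centred on each position.

    The ends of the sequence are padded so that every residue gets a
    full-length window (the padding scheme is an assumption).
    """
    if length % 2 == 0:
        raise ValueError("window length must be odd")
    half = length // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + length] for i in range(len(sequence))]


def class_entropy(segment, mapping):
    """Shannon entropy (in bits) of the class labels found in a segment.

    Characters missing from `mapping` (such as padding) are counted as a
    separate "pad" class, which is a simplifying assumption.
    """
    labels = [mapping.get(aa, "pad") for aa in segment]
    counts = Counter(labels)
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Toy sequence, not real data: one entropy value per residue.
    for segment in windows("MKVLDEAG", 5):
        print(segment, round(class_entropy(segment, HYDRO_CLASS), 3))
```

Each residue then carries one entropy value per property. In a pipeline like the one the abstract describes, such values would be concatenated with the composition and position conservation features before being passed to a classifier such as a random forest.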
Keywords/Search Tags: Ion ligand, Binding residues, Optimized feature, Information entropy, Dihedral angle, K-nearest neighbors classifier, Random Forest algorithm