Identification Of Prokaryotic Cell Wall Lyases Based On Feature Selection

Posted on: 2018-11-05
Degree: Master
Type: Thesis
Country: China
Candidate: X X Chen
GTID: 2310330512983122
Subject: Biophysics
Abstract/Summary:
Owing to the abuse of antibiotics, drug resistance in pathogenic bacteria is becoming increasingly serious, and more rational and efficient approaches to this problem are urgently needed. In the search for more effective therapeutic strategies, a great deal of effort has been devoted to the study and development of bacterial cell wall lyases. These enzymes destroy the bacterial cell structure and thereby kill the infectious bacterium, which makes them suitable candidates for antimicrobial agents: they show high potency against drug-resistant strains and a low inherent susceptibility to the emergence of new resistance phenotypes. Given the vast amount of available protein data, a computational method is needed to identify lyases accurately and efficiently. The present work was devoted to building a predictor for recognizing lyases. First, protein sequences were rigorously collected by searching the UniProt database, yielding a final benchmark dataset of 68 lyases and 307 non-lyases. Each sample sequence was then represented by an improved pseudo amino acid composition that contains not only the g-gap dipeptide composition but also the correlation of physicochemical properties between two residues. A feature selection technique based on the analysis of variance (ANOVA) was applied to obtain an optimal subset of 63 features, and a support vector machine (SVM) was then used to perform the prediction. In jackknife cross-validation, the best average accuracy was 84.82%, with an overall accuracy of 90.13% and an auROC of 0.926. To assist other investigators, we developed a free predictor called Lypred, available at http://lin.uestc.edu.cn/server/Lypred. We expect Lypred to be a useful tool for the study of lyases and the development of antibacterial agents.

According to differences in species, cell wall lyases can be divided into two types, endolysins and autolysins. To distinguish endolysins from autolysins, we used the 68 lyases from the Lypred dataset as the benchmark dataset for model construction. Of these, 27 endolysin sequences were taken as positive samples and the remaining 41 autolysin sequences as negative samples. Protein sequences were represented by their tripeptide composition, and a feature selection method based on the binomial distribution was used to eliminate redundant and noisy information. Finally, a support vector machine was trained on the 44 optimal features that contribute most strongly to classification. The model achieved an overall accuracy of 94.12% with an auROC of 0.986, which demonstrates the power of the proposed model.
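
The abstract names g-gap dipeptide composition and ANOVA-based feature ranking but gives no formulas. As a minimal sketch, assuming the usual definitions (the frequency of residue pairs separated by g positions, and an F score equal to between-class variance divided by within-class variance), the two steps might look like the following. The function names, the skipping of non-standard residues, and the normalization are assumptions of this sketch, not details taken from the thesis; the physicochemical correlation terms and the SVM are omitted.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def g_gap_dipeptide_composition(seq, g=0):
    """Frequency of each residue pair separated by g positions.

    g=0 gives ordinary adjacent dipeptides. Pairs that contain a
    non-standard residue are skipped (an assumption of this sketch).
    """
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def anova_f_score(values, labels):
    """One-way ANOVA F statistic for a single feature."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    # Between-class variance: spread of class means around the grand mean.
    between = sum(len(g) * (means[y] - grand_mean) ** 2
                  for y, g in groups.items()) / (k - 1)
    # Within-class variance: spread of samples around their own class mean.
    within = sum((v - means[y]) ** 2
                 for y, g in groups.items() for v in g) / (n - k)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(X, y):
    """Return feature indices sorted from highest to lowest F score."""
    scores = [anova_f_score([row[j] for row in X], y)
              for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

In a pipeline of this kind, one would typically compute the feature vector for every sequence, rank the features with `rank_features`, and then add features one at a time in ranked order while monitoring cross-validated accuracy; whether the thesis selected its 63 features in exactly this way is not stated in the abstract.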
Keywords/Search Tags: cell wall lyases, pseudo amino acid composition, tripeptides, feature selection, support vector machine