
The Study On Predictive Algorithm For Heat Shock Proteins And Cell Wall Lytic Enzymes

Posted on: 2022-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Jing
Full Text: PDF
GTID: 2480306527991989
Subject: Biophysics
Abstract/Summary:
Heat shock proteins (HSPs) are synthesized by organisms in response to stimulation from the external environment. They are involved in various cellular processes, such as acting as molecular chaperones, regulating apoptosis, and participating in the body's immune response. Studies have shown that HSPs are correlated with many diseases; they have become targets for specific drugs and play an irreplaceable role in the treatment of disease. Because of the overuse of antibiotics and the rapid rise of antibiotic-resistant strains, there is concern that existing antibiotics will become ineffective against these pathogens. Using cell wall lytic enzymes to destroy bacteria has become a viable alternative for avoiding the antimicrobial resistance crisis.

This thesis studies heat shock proteins and cell wall lytic enzymes, using datasets for both taken from the published literature. The amino acid composition, the dipeptide composition, the split amino acid composition, the conjoint triad feature, the position-specific scoring matrix and the auto-covariance average chemical shift were selected as features, and support vector machine, random forest, k-nearest neighbors and naive Bayes were used as prediction algorithms. To overcome the imbalanced data classification problem, the synthetic minority over-sampling technique (SMOTE) was used to balance the dataset. For the cell wall lytic enzyme dataset, the F-score was used to remove redundant information from the dipeptide composition.

For the heat shock protein families, the split amino acid composition, the dipeptide composition, the conjoint triad feature and the auto-covariance average chemical shift were combined to predict HSPs with a support vector machine. The overall accuracy was 99.72% with the balanced dataset in the jackknife test, which was 4.81% higher than with the imbalanced dataset using the same combined feature. The accuracies for HSP20, HSP40, HSP60, HSP70, HSP90 and HSP100 in our prediction model were 99.93%, 99.72%, 99.93%, 99.85%, 100% and 100%, respectively. Compared with existing methods in the literature, these were 3.57%, 7.81%, 3.97%, 7.98%, 1.57% and 2.52% higher than PredHSP, and 3.65%, 3.25%, 3.32%, 2.33%, 0.83% and 0.83% higher than ir-HSP.

For cell wall lytic enzymes, the amino acid composition, the dipeptide composition, the position-specific scoring matrix and the auto-covariance average chemical shift were combined to predict cell wall lytic enzymes with a support vector machine. The sensitivity, specificity, Matthews correlation coefficient (MCC) and accuracy were 99.35%, 99.02%, 0.98 and 99.19%, respectively, with the balanced dataset in the jackknife test. Compared with existing methods in the literature, our accuracy is 18.79%, 7.89% and 3.69% higher than Ding et al., Lypred and CWLy-SVM, respectively.
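To make two of the sequence features and the F-score filter named in the abstract concrete, the following is a minimal sketch in Python. It is not the thesis's implementation: the function names are hypothetical, the handling of non-standard residues is an assumption, and the F-score is assumed to be the common two-class form (squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances), which the abstract does not spell out.

```python
from itertools import product

# The 20 standard amino acids; any other symbol is ignored (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(seq):
    """Return a 20-dim vector: the fraction of each standard residue."""
    residues = [c for c in seq.upper() if c in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-dim vector: the fraction of each adjacent residue pair."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for a, b in zip(seq, seq[1:]):
        pair = a + b
        if pair in counts:  # skip pairs containing a non-standard residue
            counts[pair] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return [counts[d] / total for d in DIPEPTIDES]


def f_score(pos, neg):
    """Per-feature F-score for two classes (each needs at least 2 samples).

    pos and neg are lists of equal-length feature vectors. A higher score
    means the feature separates the two classes better.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    def sample_var(xs, m):
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    scores = []
    for j in range(len(pos[0])):
        p = [v[j] for v in pos]
        n = [v[j] for v in neg]
        mp, mn, ma = mean(p), mean(n), mean(p + n)
        numerator = (mp - ma) ** 2 + (mn - ma) ** 2
        denominator = sample_var(p, mp) + sample_var(n, mn)
        # A feature that is constant within both classes gets score 0 here.
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores
```

In a pipeline of the kind described, the dipeptide features would be ranked by score and only the top-ranked ones kept before being concatenated with the other features; the cutoff is a tuning choice that the abstract does not state.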
Keywords/Search Tags:Heat shock proteins, Cell wall lytic enzymes, Combination feature, F-score, Synthetic minority over-sampling technique, Support vector machine