Study On The Structure-activity Relationship Of HIV-1 Non-nucleoside Reverse Transcriptase Inhibitors And Protease Inhibitors

Posted on: 2021-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Tian
GTID: 2404330605975970
Subject: Pharmacy
Abstract/Summary:
AIDS is a worldwide disease that threatens human health. HIV-1 non-nucleoside reverse transcriptase inhibitors (NNRTIs) and HIV-1 protease inhibitors (PIs) have made important contributions to highly active antiretroviral therapy (HAART). In this thesis, targeting HIV-1 reverse transcriptase (RT) and HIV-1 protease (PR), we studied the structure-activity relationship of HIV-1 NNRTIs, performed virtual screening of newly designed NNRTI candidates, and studied the quantitative structure-activity relationship (QSAR) of HIV-1 PIs.

(1) Structure-activity relationship of HIV-1 NNRTIs: we constructed a data set of 1267 NNRTIs with their biological activity values (IC50). The data set was randomly split three times, and four types of descriptors (Avalon fingerprints, ECFP4 fingerprints, Topological-Torsion (TT) fingerprints and CORINA descriptors) were calculated to characterize the NNRTIs. Four machine learning algorithms, support vector machine (SVM), decision tree (DT), random forest (RF) and deep neural network (DNN), were used for modeling. In total, we established 48 bioactivity classification models of HIV-1 NNRTIs. The optimal model, Model 2J, built with ECFP4 fingerprints and the DNN algorithm, achieved an accuracy (Q) of 0.999 and 0.871 and a Matthews correlation coefficient (MCC) of 1 and 0.74. Analysis of the key descriptors in Model 2J showed that the substructures represented by ECFP4 145, ECFP4 900, ECFP4 141, ECFP4 925 and ECFP4 24 commonly occur in highly active inhibitors. In addition, we divided the 1267 NNRTIs into nine subsets using the t-distributed stochastic neighbor embedding (t-SNE) algorithm and the k-means clustering algorithm. Further analysis of the activity distribution within these nine subsets showed that diaromatic pyrimidine, thiaceprazole/triazole and benzyl pyrimidone scaffolds appear frequently in highly active NNRTIs.

(2) Virtual screening of newly designed HIV-1 NNRTI candidates: we generated 2817 new molecules by side-chain substitution on the diaromatic pyrimidine, thiaceprazole/triazole and benzyl pyrimidone scaffolds. These molecules were classified by the bioactivity classification models, and only those predicted to be highly active by all models were retained. Further screening was then carried out with Lipinski's rule of five, filtering of false-positive pan-assay interference compounds (PAINS), evaluation of pharmacokinetic (ADMET) properties, semi-flexible molecular docking, flexible molecular docking and patent screening. Two candidate molecules performed well in all of these screening steps and are potential lead compounds for HIV-1 NNRTIs.

(3) Quantitative structure-activity relationship (QSAR) of HIV-1 PIs: 14 QSAR models on 1238 PIs were built with four machine learning methods, multiple linear regression (MLR), support vector machine (SVM), random forest (RF) and deep neural network (DNN). The best model, Model2G, built with the DNN algorithm, gave a coefficient of determination (R²) of 0.88 and 0.79 and a root mean squared error (RMSE) of 0.39 and 0.51 on the training set and test set, respectively. For Model2G, an applicability domain threshold (ADT) of 1.765 was obtained from the training set. A compound whose similarity distance (d) is less than the ADT is considered to be inside the applicability domain and can be predicted accurately; by this criterion, 65.37% of the compounds in the test set were predicted reliably. In addition, the 1238 PIs were manually divided into eight subsets containing different scaffolds. Hydroxylamine derivatives and seven-membered cyclic urea derivatives showed higher inhibitory activity than the other subsets. We also built QSAR models with the SVM, RF and DNN methods on two subsets: 299 hydroxylamine derivative inhibitors and 377 seven-membered cyclic urea derivative inhibitors. The best model on the hydroxylamine derivatives, Model3A, gave an R² of 0.71 and an RMSE of 0.53 on the test set. The best model on the seven-membered cyclic urea derivatives, Model4B, gave an R² of 0.82 and an RMSE of 0.51 on the test set. Finally, we analyzed the descriptors that contribute significantly to the bioactivity of the inhibitors in these two subsets. Highly active seven-membered cyclic urea derivatives usually contain several aromatic nitrogen heterocyclic substituents such as imidazole and pyrazole. The oxazolidinone group and sulfanilamide mainly appear in highly active hydroxylamine derivatives. These observations may be useful in designing promising HIV-1 protease inhibitors.

In this thesis, HIV-1 NNRTIs and PIs were systematically studied by computational simulation, and a series of high-performance models were built with machine learning algorithms. By analyzing the modeling data, model results and important descriptors, we identified the structural characteristics of highly active inhibitors. These conclusions should contribute to further research on HIV-1 reverse transcriptase inhibitors and protease inhibitors.
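
The evaluation measures named in the abstract (Q, MCC, R², RMSE) and the applicability domain rule can be illustrated with a minimal sketch. It is not the code used in the thesis. The function names are hypothetical, R² is taken here as the usual 1 - SS_res/SS_tot (the thesis may define it differently), and how the similarity distance d is computed is not stated in the abstract, so d is simply passed in as a number.

```python
import math


def accuracy_and_mcc(y_true, y_pred):
    """Return (Q, MCC) for binary labels (1 = highly active, 0 = weakly active)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    q = (tp + tn) / len(pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return q, mcc


def r2_and_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for a regression model, with R^2 = 1 - SS_res / SS_tot."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


def inside_applicability_domain(d, adt=1.765):
    """A compound is inside the domain when its distance d is below the ADT.

    The default 1.765 is the ADT reported for Model2G; the way d is
    calculated is an open assumption and is left to the caller.
    """
    return d < adt


if __name__ == "__main__":
    # Toy labels and values, not data from the thesis.
    q, mcc = accuracy_and_mcc([1, 1, 0, 0], [1, 0, 0, 0])
    r2, rmse = r2_and_rmse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
    print(f"Q={q:.3f} MCC={mcc:.3f} R2={r2:.3f} RMSE={rmse:.3f}")
    print(inside_applicability_domain(1.2), inside_applicability_domain(2.0))
```

The fraction of test compounds for which `inside_applicability_domain` returns true corresponds to the kind of coverage figure quoted above (65.37% for Model2G).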
Keywords/Search Tags: HIV-1 reverse transcriptase inhibitors (RTIs), HIV-1 protease inhibitors (PIs), Support Vector Machine (SVM), Random Forest (RF), Deep Neural Networks (DNN)