Computer-aided drug design is critical for drug discovery. Precise predictions of small-molecule pK_a, low-energy tautomers in aqueous solution, and protein-small molecule binding affinity can significantly enhance molecular design efficiency and increase the enrichment rate in virtual screening. We developed two small-molecule property prediction models based on machine learning methods and created a new protein-ligand scoring function based on linear regression. The details of the research are as follows:

1. Accurate and fast estimation of small-molecule pK_a is vital during the drug discovery process. We have developed MolGpKa, a web server that leverages a graph-convolutional neural network model to predict pK_a. The model operates by autonomously learning chemical patterns that relate to pK_a and subsequently generating dependable predictions using those learned features. We trained the model using pK_a data from the ChEMBL database for 1.6 million compounds, and the results demonstrate that MolGpKa outperforms machine learning models that rely on human-engineered fingerprints.

2. In computer-aided drug discovery, quickly and accurately identifying the major tautomeric state of a drug-like molecule is crucial, since it determines the molecule's pharmacophore features and physical properties. To address this challenge, we developed MolTaut, a tool that rapidly generates favorable states of drug-like molecules in water. MolTaut works by enumerating possible tautomeric states using transformation rules, ranking the tautomers based on their relative internal and solvation energies calculated by AI-based models, and generating preferred ionization states using predicted microscopic pK_a values. Our tests show that the AI-based tautomer scoring approach performs comparably to the DFT method (ωB97X/6-31G*//M06-2X/6-31G*/SMD) from which the AI models were trained. Furthermore, MolTaut effectively predicts the substitution effect on tautomeric equilibrium, making it a useful tool
for computer-aided ligand design.

3. The protein-ligand scoring function plays an important role in computer-aided drug discovery and is heavily used in virtual screening and lead optimization. In this study, we developed AA-Score, an empirical protein-ligand scoring function that includes amino acid-specific interaction components for hydrogen-bond, van der Waals, and electrostatic interactions. The scoring function also includes hydrophobic, π-stacking, cation-π, and metal-ligand interactions. Our extensive testing indicates that AA-Score outperforms other widely used traditional scoring functions in scoring, docking, and ranking.

The methods employed in this study have demonstrated superior accuracy in predicting small-molecule properties and reduced computation time when compared to previous research. These results provide valuable insights for future studies focused on property prediction and binding free energy calculation for small molecules.
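As an illustration of the tautomer-ranking step described in item 2, the following is a minimal sketch and not the MolTaut implementation. It assumes that each enumerated tautomer already carries a predicted internal energy and a predicted solvation free energy (both in kcal/mol), sums them, and converts the relative energies into Boltzmann populations at room temperature. It also shows how a predicted microscopic pK_a could be turned into a protonated fraction with the Henderson-Hasselbalch relation. The names `Tautomer`, `rank_tautomers`, and `fraction_protonated` are hypothetical, and the energies in the usage example are made-up numbers chosen only to exercise the code.

```python
import math
from dataclasses import dataclass

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


@dataclass
class Tautomer:
    smiles: str
    e_internal: float  # predicted internal energy, kcal/mol (assumed model output)
    dg_solv: float     # predicted solvation free energy, kcal/mol (assumed model output)


def rank_tautomers(tautomers, temperature=298.15):
    """Return (tautomer, relative energy, population) tuples, lowest energy first."""
    totals = [t.e_internal + t.dg_solv for t in tautomers]
    e_min = min(totals)
    rel = [e - e_min for e in totals]
    # Boltzmann weights from the relative aqueous-phase energies.
    weights = [math.exp(-d / (R_KCAL * temperature)) for d in rel]
    z = sum(weights)
    rows = zip(tautomers, rel, (w / z for w in weights))
    return sorted(rows, key=lambda row: row[1])


def fraction_protonated(pka, ph=7.4):
    """Henderson-Hasselbalch: fraction of a site in its protonated form at a given pH."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


# Illustrative input only: these energies are NOT computed or published values.
candidates = [
    Tautomer("Oc1ccccn1", e_internal=0.0, dg_solv=-5.0),      # 2-hydroxypyridine
    Tautomer("O=c1cccc[nH]1", e_internal=1.0, dg_solv=-9.0),  # 2-pyridone
]
ranked = rank_tautomers(candidates)
for taut, d_e, pop in ranked:
    print(f"{taut.smiles:16s} dE = {d_e:5.2f} kcal/mol  population = {pop:.4f}")
```

In this sketch the solvation term is what reorders the two forms: the second tautomer is slightly higher in internal energy but is ranked first once its more favorable solvation energy is added, which is the kind of effect a combined internal-plus-solvation ranking is meant to capture.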