Time Evaluation Based On Maximum Common Substructure Algorithm And Construction Of Active Small Molecule Virtual Screening Platform

Posted on: 2013-10-12 | Degree: Master | Type: Thesis
Country: China | Candidate: J Chen | Full Text: PDF
GTID: 2271330434970434 | Subject: Bioinformatics
Abstract/Summary:
Finding lead compounds with biological activity or medicinal efficacy among a large number of small-molecule compounds is an important preliminary step in drug discovery. Small-molecule products and synthetic compounds already number in the tens of millions, and advances in synthesis in recent years keep adding more. As a result, traditional high-throughput screening can no longer meet the needs of new drug development, and a computational prediction method is urgently needed to expand the number of lead compounds. Virtual screening is a promising way to address this problem.

Algorithms based on the maximum common substructure are a very promising approach to virtual drug screening. The maximum common substructure (MCS) is the largest substructure shared by two compounds. An MCS-based similarity measure has several advantages. First, the MCS of structurally similar drugs is very likely to be the key structural element related to their activity. Second, the common part of a pair of chemical structures, which directly determines and explains the similarity score, can be easily visualized. However, searching for the MCS is computationally intensive: the time complexity increases exponentially with the number of atoms in the two molecules, so computing the MCS between complex molecules takes a very long time. In practical applications there is always a trade-off between efficiency and accuracy, which limits the use of the method in virtual screening. In this work we evaluate the running time of the method using the essential drugs defined by WHO and FDA-approved small-molecule drugs. By varying the amount of time allocated to MCS-based virtual screening, we statistically analyze the impact of computation time on the screening results.
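The idea of giving an MCS search a fixed time budget can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis. It assumes molecules are given as lists of atom labels plus bond pairs, ignores bond orders, allows disconnected common substructures, and scores similarity with a Tanimoto-style ratio of atom counts; the function names `mcs_size` and `mcs_similarity` and the parameter `time_limit_s` are hypothetical.

```python
import time


def mcs_size(atoms_a, bonds_a, atoms_b, bonds_b, time_limit_s=1.0):
    """Backtracking search for the largest common induced subgraph.

    Returns (best_size, timed_out). If the time budget runs out, best_size is
    the best match found so far, which may be smaller than the true MCS.
    """
    deadline = time.monotonic() + time_limit_s
    edges_a = {frozenset(b) for b in bonds_a}
    edges_b = {frozenset(b) for b in bonds_b}
    best = 0
    timed_out = False

    def compatible(i, j, mapping):
        # Atom labels must match, and bonds to already mapped atoms must agree.
        if atoms_a[i] != atoms_b[j]:
            return False
        for p, q in mapping.items():
            if (frozenset((i, p)) in edges_a) != (frozenset((j, q)) in edges_b):
                return False
        return True

    def extend(i, mapping, used_b):
        nonlocal best, timed_out
        if time.monotonic() > deadline:
            timed_out = True
            return
        best = max(best, len(mapping))
        if i == len(atoms_a):
            return
        # Prune: mapping every remaining atom still could not beat the best.
        if len(mapping) + (len(atoms_a) - i) <= best:
            return
        for j in range(len(atoms_b)):
            if j not in used_b and compatible(i, j, mapping):
                mapping[i] = j
                used_b.add(j)
                extend(i + 1, mapping, used_b)
                del mapping[i]
                used_b.discard(j)
                if timed_out:
                    return
        extend(i + 1, mapping, used_b)  # also try leaving atom i unmapped

    extend(0, {}, set())
    return best, timed_out


def mcs_similarity(mol_a, mol_b, time_limit_s=1.0):
    """Tanimoto-style score |MCS| / (|A| + |B| - |MCS|); an assumed formula."""
    (atoms_a, bonds_a), (atoms_b, bonds_b) = mol_a, mol_b
    size, timed_out = mcs_size(atoms_a, bonds_a, atoms_b, bonds_b, time_limit_s)
    return size / (len(atoms_a) + len(atoms_b) - size), timed_out


# Heavy-atom graphs of ethanol (C-C-O) and 1-propanol (C-C-C-O).
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
propanol = (["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
print(mcs_similarity(ethanol, propanol, time_limit_s=1.0))
```

Rerunning the same comparison with smaller values of `time_limit_s` and checking how often the score changes is one simple way to study the efficiency and accuracy trade-off described above.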
The results show that time efficiency can be improved without compromising accuracy by setting proper time thresholds. Differences in CPU speed between machines were taken into account, so the result can be generalized. In addition, the similarity of compound structures and its relevance to biological activity are analyzed quantitatively, which highlights the applicability of MCS-based methods to predicting the functions of small molecules.

The discovery of a large number of disease-related targets and their potential therapeutic drugs has produced a large amount of biological information and data. This information is mostly fragmented: data content is uneven and file formats lack uniform standards. Incompatibility causes many problems and the relationships among the data are complex, so in-depth data mining is difficult. With the help of information technology, we aim to establish an integrated bioinformatics and cheminformatics web platform, M&Function. Through data mining of a large body of electronic resources and papers, we integrate small-molecule drug names, structures, functions, classifications and other information, and build a small-molecule drug information resource library. Based on MCS-based and fingerprint algorithms, we establish a system for predicting the functions of active small molecules. With its comprehensive data, embedded graphics and statistics plug-ins, and user-friendly website design, the M&Function platform is intuitive, efficient and easy to use, and its results on test data are reliable. M&Function is not only a library of small-molecule drug information but also a function prediction platform. It can provide information and data support for high-throughput screening of lead compounds. It is available at http://lifecenter.sgst.cn/mcs/home.do.
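Similarity-based function prediction of the kind described above can be sketched in a few lines of Python. This is an illustrative assumption, not the M&Function implementation: fingerprints are represented as sets of on-bit indices, similarity is the Tanimoto coefficient, and the library entries, function labels, `top_k` and `min_similarity` are all hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return len(fp_a & fp_b) / len(union)


def predict_functions(query_fp, library, top_k=3, min_similarity=0.5):
    """Return the annotated functions of the most similar library compounds.

    library is a list of (name, fingerprint, function) tuples.
    """
    scored = sorted(
        ((tanimoto(query_fp, fp), name, function) for name, fp, function in library),
        key=lambda item: item[0],
        reverse=True,
    )
    return [
        (name, function, round(score, 3))
        for score, name, function in scored[:top_k]
        if score >= min_similarity
    ]


# Hypothetical library; the names, bits and labels are placeholders, not real data.
library = [
    ("drug_A", {1, 2, 3, 4, 5}, "analgesic"),
    ("drug_B", {1, 2, 9}, "antibiotic"),
    ("drug_C", {7, 8}, "antiviral"),
]
print(predict_functions({1, 2, 3, 4}, library))
```

An MCS-based score could replace `tanimoto` here without changing the ranking logic; the fingerprint version is shown because it is the cheaper of the two to compute.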
Keywords/Search Tags: Drug screening, MCS, Time evaluation, Drug database, Function prediction