Design and Implementation of a Compound Retrosynthesis System Based on Deep Learning

Posted on: 2021-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: S H Guo
GTID: 2381330620964207
Subject: Engineering
Abstract/Summary:
Retrosynthesis analysis systems play an important role in fields such as drug design and materials applications. Since the mid-20th century, a growing number of researchers have worked in this area. In recent years, the rapid development of deep learning has brought milestone improvements to many fields; among these, graph neural networks, which process non-Euclidean data, have made great progress since they were first proposed in 2009. Chemical molecules are typical graph-structured data. Previous retrosynthesis analysis systems are either rule-based expert systems or traditional neural network models. This thesis processes chemical molecules with graph neural networks, combined with recently released open-source tools for handling chemical molecules. In addition, inspired by AlphaGo, Monte Carlo tree search is used to find a decomposition path that meets the requirements within the huge decomposition tree space, balancing computing resource overhead against search quality. The thesis implements the overall system and tests both methods. The main research contents are as follows.

First, a single-step inverse decomposition method is designed and implemented; it is the basic part of the entire retrosynthesis system. With a graph neural network at its core, the method takes a target molecule as input and outputs a list of reaction templates applicable to that molecule. The method is still based on chemical rules, but the rules are no longer manually coded; they are extracted automatically by open-source tools from reactions annotated with atom-mapping numbers. Molecules obtained by rule-based reverse decomposition are better at avoiding "wrong" molecules that do not conform to chemical laws. At the same time, using graph neural networks avoids time-consuming subgraph matching, so finding the reaction templates for a target molecule takes about 1 second, which greatly improves search efficiency.

Second, a path-building method is designed and implemented. It is based on Monte Carlo tree search, implements the four main stages of the algorithm, and uses a customized reward function to guide the search toward paths that end in the raw material library.

Finally, the two methods are tested systematically, and the possible causes of the observed defects are analyzed in light of the experimental results. The single-step inverse decomposition method achieved a precision of 80% and a recall of 65%, indicating that it is basically usable. For the path-building method, the average raw material hit rate of the molecules obtained in the final step of the retrosynthesis path reached 61.8%. Overall, the system basically achieved its design goals and is preliminarily usable. The overall structure is flexible, so individual parts can be extended or modified for more specific requirements.
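
The path-building idea in the abstract (Monte Carlo tree search guided by a reward that favors molecules found in the raw material library) can be illustrated with a minimal sketch. This is not the thesis's implementation: the template table TOY_TEMPLATES, the stock set STOCK, the depth limit, the UCB1 selection rule, and the reward (fraction of molecules in stock) are all hypothetical stand-ins chosen for illustration. In the real system the single-step model would supply the candidate decompositions.

```python
import math
import random

# Hypothetical toy data: each molecule maps to candidate precursor sets.
# In the thesis this role is played by the single-step template model.
TOY_TEMPLATES = {
    "target": [("A", "B"), ("C",)],
    "A": [("s1", "s2")],
    "B": [("s3",)],
    "C": [("x",)],
}
STOCK = {"s1", "s2", "s3"}  # hypothetical raw material library
MAX_DEPTH = 5               # assumed limit on decomposition steps


def successors(molecules):
    """All states reachable by decomposing one non-stock molecule."""
    result = []
    for mol in sorted(molecules):
        if mol in STOCK:
            continue
        for precursors in TOY_TEMPLATES.get(mol, []):
            result.append(frozenset((molecules - {mol}) | set(precursors)))
    return result


def reward(molecules):
    """Assumed reward: fraction of molecules in stock (1.0 means solved)."""
    return sum(m in STOCK for m in molecules) / len(molecules)


class Node:
    def __init__(self, molecules, parent=None, depth=0):
        self.molecules = frozenset(molecules)
        self.parent = parent
        self.depth = depth
        self.children = []
        self.untried = successors(self.molecules)
        self.visits = 0
        self.value = 0.0


def ucb1(child, parent_visits, c=1.4):
    """Mean reward plus an exploration bonus."""
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def search(target, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node({target})
    best_state, best_reward = root.molecules, reward(root.molecules)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes by UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one untried child state.
        if node.untried and node.depth < MAX_DEPTH:
            child = Node(node.untried.pop(), parent=node, depth=node.depth + 1)
            node.children.append(child)
            node = child
        # 3. Simulation: random decompositions until stuck or depth limit.
        state = node.molecules
        for _ in range(MAX_DEPTH - node.depth):
            options = successors(state)
            if not options:
                break
            state = rng.choice(options)
        r = reward(state)
        if r > best_reward:
            best_state, best_reward = state, r
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    return best_state, best_reward


if __name__ == "__main__":
    print(search("target"))
```

For brevity the sketch returns only the best final set of molecules and its reward, not the full reaction path; recording the path would mean storing the applied decomposition on each node.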
Keywords/Search Tags: Retrosynthesis analysis, Graph neural network, Monte Carlo tree search