Compound-protein Interaction Prediction Based On Molecular Images

Posted on:2023-02-06

Degree:Master

Type:Thesis

Country:China

Candidate:L H Yu

GTID:2544307097979009

Subject:Computer Science and Technology

Abstract/Summary:

The study of compound-protein interactions is of great importance throughout the drug discovery process. Traditional methods for exploring compound-protein interactions rely on time-consuming and resource-intensive biological experiments, whereas in silico models have overcome these limitations to a certain extent thanks to the recent rapid development of computer technology. Most previous prediction methods focus on extracting compound features from graph-based molecular representations. In contrast, image-based representations of molecules, which contain more information on the molecular structure, are rarely reported for compound-protein interaction prediction tasks. Meanwhile, a flurry of strong work in computer vision has laid a solid technical foundation for molecular image feature extraction. Therefore, this thesis investigates the feasibility and effectiveness of molecular images for compound-protein interaction prediction by combining techniques from computer vision, multimodal learning, and multi-task unsupervised pre-training. The main work and contributions of this thesis are as follows:

Firstly, we propose Image CPI, a multimodal framework for predicting compound-protein interactions from molecular images. The model takes molecular images and amino acid sequences as inputs, incorporating the visual and textual modalities. To bridge the “semantic gap” between the heterogeneous modalities, Image CPI employs a CNN-GRU network to extract features of the images and blends them through an attention mechanism. Experiments demonstrate the superiority of our model over several state-of-the-art methods. To alleviate the “black box” nature of the model and explore the interpretability of Image CPI, visualization experiments are conducted on its prediction results, which confirm its ability to capture compound-protein binding sites.

Secondly, to enhance the prediction performance of Image CPI and alleviate the shortage of labeled samples in drug discovery, we propose Mol IMG, an unsupervised pre-training model built on a multi-task learning framework, which learns the “molecular similarity” and “molecular legitimacy” tasks simultaneously through hard parameter sharing. The model is pre-trained on 2 million unlabeled molecular images, which improves convergence on downstream tasks and alleviates overfitting. Experiments on different downstream tasks show that Mol IMG strengthens the prediction ability of Image CPI and improves performance on several molecular property prediction tasks. In addition, this thesis constructs the “Anti-SARS-CoV-2 Drug” dataset, which suffers from the “lacking labeled samples” and “unbalanced data” problems. The results show that Mol IMG mitigates these drawbacks of the dataset to a certain extent and, further, can predict antiviral drugs beyond the training samples.
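
The hard parameter sharing mentioned in the abstract can be illustrated with a minimal sketch: one shared encoder feeds two task-specific heads, and the two task losses are summed so that both tasks update the same shared weights. This is not the thesis's implementation. The class name `SharedTrunkModel`, the layer sizes, the use of a plain feature vector instead of a molecular image, and the treatment of both tasks as binary classification are all assumptions made purely for illustration.

```python
import math
import random


def linear(x, w, b):
    """Dense layer: x is a list of floats, w holds one weight row per output."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + bi for row, bi in zip(w, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def init(n_out, n_in, rng):
    """Small random weights and zero biases for one dense layer."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


class SharedTrunkModel:
    """Hypothetical two-task model: one shared encoder, two task heads."""

    def __init__(self, n_in=8, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.trunk = init(n_hidden, n_in, rng)      # shared by both tasks
        self.sim_head = init(1, n_hidden, rng)      # placeholder "similarity" head
        self.legit_head = init(1, n_hidden, rng)    # placeholder "legitimacy" head

    def encode(self, x):
        return relu(linear(x, *self.trunk))

    def forward(self, x):
        h = self.encode(x)  # both heads read the same shared representation
        p_sim = sigmoid(linear(h, *self.sim_head)[0])
        p_legit = sigmoid(linear(h, *self.legit_head)[0])
        return p_sim, p_legit


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction p and label y in {0, 1}."""
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def joint_loss(model, x, y_sim, y_legit):
    """Sum of the two task losses; gradients of both would reach the trunk."""
    p_sim, p_legit = model.forward(x)
    return bce(p_sim, y_sim) + bce(p_legit, y_legit)


model = SharedTrunkModel()
features = [0.1] * 8  # stand-in for features extracted from a molecular image
print(model.forward(features), joint_loss(model, features, 1, 0))
```

In a real setting the shared trunk would be an image encoder trained by backpropagation, and the relative weighting of the two losses would be a design choice; the sketch only shows where the sharing happens.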

Keywords/Search Tags:

Compound-protein interaction, Multimodal learning, Multi-task learning, Unsupervised pre-training

Related items

1	Breast Mass Detection From The Digitized Mammograms Based On Machine Learning
2	The Research Of Computer-aided Diagnosis In Chest Images Based On Multi-semantic Task And Multi-label Incremental Learning
3	Research On Multi-task Deep Transfer Learning Algorithm For Virtual Ligand Screening
4	Named Entity Recognition Of Online Medical Consulting Texts Based On Deep Learning
5	Compound-Protein Interaction Prediction Based On Deep Learning
6	Multimodal Learning With Application To Medical Data Analysis
7	Cervical Cell Image Segmentation Method Based On Multi-Task Learning
8	Research On Sparse Multi-task Learning Algorithm Of Alzheimer’s Disease Prediction
9	Research And Application Of Medical Entity Extraction Based On Multi-task Learning And Transfer Learning
10	MRI Reconstruction And Segmentation Based On Multi-task Learning