The Algorithm Research And Tool Development For Omics-Based Drug Target Discovery

Posted on: 2024-09-22  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Y X Wang  Full Text: PDF
GTID: 1524307163477594  Subject: Pharmacy
Abstract/Summary:
Drug research and development can be roughly divided into several stages: target identification and validation, screening and optimization of lead compounds, preclinical research, and clinical trials. Identifying the targets associated with a particular disease is the first step, and the rapid development of omics techniques has generated a wealth of data that can be used for target screening. Omics techniques are therefore often applied to identify new therapeutic targets. However, modern drug discovery still faces many challenges.

First, many databases collect basic information about drugs and therapeutic targets, but target regulators and patented agents cannot be accessed from them. Their data access facilities are also limited because no therapeutic targets or targeted diseases are explicitly labeled. Second, RNA interactomics plays an important role in drug target discovery, and a variety of powerful computational methods have been developed to predict RNA-associated interactions. All of these methods rely heavily on the 'digitalization' of RNA-associated interacting pairs into computer-recognizable descriptors, so representing these interactions accurately has become a challenge for drug target discovery. Third, the key role of feature selection in omics research is underestimated, and the appropriate application of feature selection methods in particular is largely ignored. The feature selection methods applied in current omics research have low robustness, which greatly reduces the reliability of the selected biomarkers and seriously hinders therapeutic target discovery. Finally, unwanted experimental or biological variation is inevitable in mass spectrometry-based metabolomics studies and greatly increases the probability of false positives in metabolic profiling. How to apply bioinformatics to reduce these false positives has become a great challenge for omics research.

In view of these four key problems, this dissertation focuses on the following four aspects:

(1) To complement other databases, the Therapeutic Target Database (TTD, http://db.idrblab.net/ttd/) was constructed with expanded information about target-regulating transcription factors and microRNAs, target-interacting proteins, and patented agents and their targets.

(2) A novel deep learning-based strategy for representing RNA-associated interactions was proposed. This strategy first provides a comprehensive RNA feature encoding method. A self-supervised deep learning model, the autoencoder, is then used for feature embedding and for integrating the features of an RNA and its interacting molecule. The effectiveness of this strategy was extensively validated in multiple case studies, where it significantly improved the prediction performance for RNA-associated interactions. To make the strategy convenient for researchers to use, the webserver RNAincoder (https://idrblab.org/rnaincoder/) was constructed.

(3) The performance of 14 feature selection methods was comprehensively assessed under two criteria on six proteomics benchmark datasets. The results demonstrated significant differences in the performance of the feature selection methods under the two criteria. Moreover, the results highlighted the importance of choosing an appropriate feature selection method in biomarker discovery, a choice that strongly affects the success rate of subsequent experimental target validation.

(4) NOREVA (https://idrblab.org/noreva/) was constructed to enable normalization and evaluation of time-course and multiclass metabolomics data. It integrates 144 normalization methods under a combination strategy and identifies the well-performing methods by comprehensively assessing the largest set of normalizations to date. The capability of NOREVA to identify well-performing normalization method(s) was extensively validated by case studies on benchmark datasets.

In summary, in view of the severe challenges faced by omics technology in drug target discovery, therapeutic target regulation information from multiple omics levels and drug information from patents were added to the Therapeutic Target Database. A deep learning-based strategy for effectively representing RNA-associated interactions was proposed, and an online tool was constructed. A systematic assessment of various feature selection methods under two criteria was then conducted to find a more robust method. Finally, to reduce false positives, normalization methods were comprehensively evaluated on a variety of metabolomics datasets using the combination strategy under multiple criteria, and a webserver was provided. This study provides strong support for therapeutic target regulatory information at the omics level and for accurate molecular representation of RNA interactomics in drug target discovery. It also provides a reference for researchers choosing optimal data processing methods in proteomics and metabolomics. Overall, these studies provide new ideas and strategies for drug target discovery, complex disease diagnosis, and precision medicine.
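The abstract does not name the two criteria used to compare feature selection methods, so the following is only an illustrative sketch of one common way to quantify robustness: select the top-k features on repeated stratified subsamples and average the pairwise Jaccard overlap of the selected sets. The filter (absolute difference in class means), the function names, and all parameters are assumptions made for illustration and are not taken from the dissertation.

```python
import random
from itertools import combinations
from statistics import mean


def top_k_by_mean_difference(samples, labels, k):
    """Hypothetical filter: rank features by |mean(class 1) - mean(class 0)|."""
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        group0 = [row[j] for row, y in zip(samples, labels) if y == 0]
        group1 = [row[j] for row, y in zip(samples, labels) if y == 1]
        scores.append(abs(mean(group1) - mean(group0)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard overlap of two feature-index sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(samples, labels, k, n_rounds=20, fraction=0.8, seed=0):
    """Mean pairwise Jaccard overlap of top-k sets across stratified subsamples.

    Assumes binary labels (0/1) and n_rounds >= 2.
    """
    rng = random.Random(seed)
    idx0 = [i for i, y in enumerate(labels) if y == 0]
    idx1 = [i for i, y in enumerate(labels) if y == 1]
    selections = []
    for _ in range(n_rounds):
        # Draw the same fraction from each class so both are always present.
        chosen = (rng.sample(idx0, max(1, int(fraction * len(idx0))))
                  + rng.sample(idx1, max(1, int(fraction * len(idx1)))))
        sub_x = [samples[i] for i in chosen]
        sub_y = [labels[i] for i in chosen]
        selections.append(top_k_by_mean_difference(sub_x, sub_y, k))
    return mean(jaccard(a, b) for a, b in combinations(selections, 2))


def make_toy_data(n_per_class=20, n_features=10, n_informative=2, seed=1):
    """Synthetic data: the first n_informative features are shifted in class 1."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in range(n_informative):
                    row[j] += 5.0
            samples.append(row)
            labels.append(label)
    return samples, labels


if __name__ == "__main__":
    x, y = make_toy_data()
    print("stability (k=2):", selection_stability(x, y, k=2))
```

A score near 1 means the same features are selected regardless of which samples are drawn, while a score near 0 means the selected set depends heavily on the subsample. Any real comparison would substitute the actual feature selection methods and benchmark datasets for the toy filter and synthetic data used here.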
Keywords/Search Tags: Drug target discovery, Bioinformatics, Proteomics, Metabolomics, Deep learning, Feature selection