Research On Deep Learning Based Gene Regulatory Network Inference

Posted on: 2020-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: G B Chen
GTID: 2370330590473942
Subject: Computer Science and Technology
Abstract/Summary:
A gene regulatory network describes the regulatory relationships between genes, RNAs, and regulators. By analyzing gene regulatory networks, physiological processes can be understood at the genome scale. As a crucial part of computational biology, gene regulatory network inference has long been an active research topic. Existing inference methods fall into two main types: approaches based on feature engineering and machine learning, and approaches based on deep learning. Feature engineering and machine learning approaches have three main disadvantages: they cannot capture the directionality of regulatory relationships, they are ill-suited to large-scale networks and to the design of input features, and their feature selection is unstable. Deep learning-based approaches suffer from insufficient and high-dimensional data, and for this reason they are not yet the mainstream methods for gene regulatory network inference.

To address the problems of existing methods, this thesis focuses on data construction and model design. Based on a deep learning model and a noise estimation method, we first construct the training data in a principled way and then propose a novel gene regulatory network inference method. By analyzing the label distributions of the mouse gene expression data from the ENCODE project, we design a negative sampling method suited to gene expression data. By combining data characteristics with biological background knowledge, we propose a Match-LSTM model based on the semantic matching framework as the baseline. To model the influence of different time points and cell environments, we propose the Internal-Att-Match model and the Interactive-Att-Match model by applying the attention mechanism to the Match-LSTM model. Experimental results on the mouse gene expression dataset show that the proposed models obtain F1 values of 0.831 and 0.838, respectively. Compared with the Match-LSTM model, the F1 values of the Internal-Att-Match and Interactive-Att-Match models increase by at least 1.0%. This result shows that it is reasonable to apply the attention mechanism to model cellular environment changes and gene interactions.

To address the lack of available regulatory relationships and the low credibility of prior regulatory networks in gene regulatory network inference, we propose a novel framework that combines a class-noise estimation method with semi-supervised learning, referred to as Denoise-Semi. To incorporate low-credibility regulatory relationships into the training process as prior knowledge, Denoise-Semi first calculates the noise probability of the prior data with the class-noise estimator, and then keeps the high-quality samples according to that noise probability. The results on RegNetwork and PriorNetwork show that the F1 value of Denoise-Semi is at least 2.0% higher than that of the baseline model. Through data visualization, we observe that the proposed model can indeed identify high-quality samples in prior networks effectively and thereby improve the performance of the core classifiers.
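
The filtering step of Denoise-Semi described above can be illustrated with a minimal Python sketch. The abstract does not specify the class-noise estimator, so the estimate used here (one minus a classifier's confidence in the positive label), the function names, the threshold of 0.5, and the gene names are all assumptions made for illustration only, not the thesis's actual method.

```python
from typing import List, Tuple

# A candidate regulatory relationship: (regulator, target). Names are hypothetical.
Edge = Tuple[str, str]


def noise_probability(confidence: float) -> float:
    """Stand-in noise estimate (an assumption, not the thesis's estimator):
    one minus the classifier's confidence that the edge is a true regulation."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return 1.0 - confidence


def keep_high_quality(
    prior: List[Tuple[Edge, float]], threshold: float = 0.5
) -> List[Edge]:
    """Keep prior edges whose estimated noise probability is at most `threshold`."""
    return [
        edge
        for edge, confidence in prior
        if noise_probability(confidence) <= threshold
    ]


# Hypothetical prior network with classifier confidences.
prior_edges = [
    (("GeneA", "GeneB"), 0.9),
    (("GeneC", "GeneD"), 0.2),
    (("GeneE", "GeneF"), 0.5),
]
filtered = keep_high_quality(prior_edges, threshold=0.5)
```

In a semi-supervised setting, the retained edges would then be added to the labeled training set, while the discarded ones would be left out or treated as unlabeled; how the thesis handles them is not stated in the abstract.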
Keywords/Search Tags: gene regulatory network inference, deep learning, class-noise estimation, attention, semi-supervised learning