
Research And Implementation Of Deep Learning-based Prediction Of Super-enhancer-promoter Relationship

Posted on: 2022-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
GTID: 2510306320468324
Subject: Computer technology
Abstract/Summary:
In eukaryotic cells, gene regulation and precise gene expression play a key role in biological activity. Studying enhancer-promoter interactions (EPIs) helps clarify how genes are regulated, reveals disease-related genes, and provides new ideas and methods for disease diagnosis and treatment. Traditional bioassay methods are expensive, time-consuming, and limited by resolution, so it is difficult to identify individual EPIs accurately. Computational approaches to biological problems have become a research hotspot in recent years, and several studies have used deep learning algorithms to predict EPIs with good results. Such methods learn sequence features and spatial structure through complex network architectures and then predict EPIs accurately.

A super-enhancer is a special type of enhancer: it is composed of enhancers with the same activity and acts on the promoter of its target gene through transcription factors, so super-enhancers and promoters also interact. In this thesis, we therefore use the common enhancer-promoter construction method to build super-enhancer-promoter interaction (SEPI) data sets, and apply deep learning methods that have recently become popular in EPI prediction to test the prediction of SEPIs. The innovations are as follows:

1. We used chromatin characteristics to construct SEPI data in six cell lines. Data sets built this way reflect activity in specific cell lines more accurately than those built by the traditional method of linking each element to its proximal gene.

2. A data augmentation method is proposed. The unbalanced SEPI data set, whose negative set is about 20 times the size of the positive set, is augmented so that the positive set expands 20-fold, yielding a balanced data set. The model is trained and tested on this data set to verify the feasibility of deep learning for SEPI prediction.

3. At present, SEPIs can only be detected with high-throughput experimental methods, whose high cost and long duration are limiting. Deep learning algorithms can capture latent features that are difficult to identify by eye. This thesis applies deep learning methods from computer science to SEPI prediction: the sequence information of the super-enhancer and the promoter is fed into three convolutional layers for feature extraction and feature fusion, and the prediction is produced by a sigmoid function.

The deep learning algorithm proposed in this thesis can predict chromatin relationships using ordinary convolutional layers and DNA sequence information. On the unbalanced data set, the AUROC and AUPR are 0.92-0.95 and 0.91-0.94, respectively. However, because SEPI data are scarce and highly unbalanced, the model may focus on learning the negative set during training. After data balancing and data augmentation the evaluation metric is 0.6, which still indicates that deep learning can learn features from sequence information in SEPI prediction. We also found that training on more data, namely the data of all cell lines, improves the results by about 0.001, indicating that more data helps model training.
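The abstract reports AUROC and AUPR on a data set whose negative set is roughly 20 times the positive set. As a minimal sketch of how these two metrics behave under that kind of imbalance, the pure-Python functions below compute AUROC as a pairwise ranking statistic and approximate AUPR by average precision. The function names, the toy labels, and the toy scores are illustrative assumptions and are not taken from the thesis; the thesis may compute AUPR with a different interpolation.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision at the rank of each positive.

    Used here as a step-wise estimate of AUPR. Tied scores keep their
    input order, because Python's sort is stable.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive sample")
    return precision_sum / hits


# Hypothetical toy data: 1 positive and 20 negatives (a 1:20 imbalance).
# One negative is scored above the positive; the rest are scored below it.
toy_labels = [1] + [0] * 20
toy_scores = [0.8] + [0.9] + [0.1] * 19

toy_auroc = auroc(toy_labels, toy_scores)            # 19 of 20 pairs correct
toy_ap = average_precision(toy_labels, toy_scores)   # positive sits at rank 2

print(f"AUROC = {toy_auroc:.2f}, average precision = {toy_ap:.2f}")
```

In this toy case a single high-scoring negative leaves AUROC at 0.95 but halves the average precision to 0.50, which is why both metrics are usually reported together on unbalanced data.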
Keywords/Search Tags:Enhancer, Super-enhancer, Promoter, Convolutional Neural Network, Deep Learning