Research And Application Of Deep Learning In Plant LncRNA Recognition

Posted on: 2020-05-01; Degree: Master; Type: Thesis
Country: China; Candidate: Z Chang
GTID: 2370330596482451; Subject: Computer technology
Abstract/Summary:
Non-coding RNAs longer than 200 nt are called long non-coding RNAs (lncRNAs), and they are a hot topic in current research. Although lncRNAs cannot encode proteins, they indirectly affect protein formation by acting on other molecules. With the development of sequencing technology, a large number of sequences have been discovered, and only by accurately identifying lncRNAs can we lay the foundation for exploring their internal structure and predicting their function. Research on human and animal lncRNA recognition is relatively mature, but plant lncRNAs are structurally complex and the number of available samples is insufficient, so they are difficult to identify. Moreover, most previous methods rely on feature engineering, in which features are extracted manually, and therefore cannot learn the intrinsic features of the sequence. It is thus of great significance to use deep learning to identify plant lncRNAs efficiently and accurately and to predict their function. In this thesis, two lncRNA recognition models, lncRNA-LSTM and lncRNA-CNN, are constructed based on the long short-term memory network (LSTM) and the convolutional neural network (CNN), respectively. Cluster-based undersampling is applied to the negative sample set to balance the positive and negative samples. To feed RNA sequences into the LSTM, each sequence is p-nts encoded: each successive group of p nucleotides is mapped to a code, so that every RNA is represented as a sequence of numbers. For the CNN, each RNA is one-hot encoded and represented as a 4 × n matrix. The data are split into a training set and a test set at a ratio of 8:2. The overall accuracy of lncRNA-LSTM and lncRNA-CNN on the test set reached 96.2% and 95.2%, respectively. To demonstrate the advantage of the proposed methods, a comparative experiment based on feature engineering is added, in which secondary structure, k-mers and other features are extracted and support vector machines and other models are used as classifiers. On the corn dataset, the proposed methods show better performance. In addition, the proposed methods outperform the currently popular tools CPC2, CNCI, PLEK and LncADeep on the same datasets.
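The two sequence encodings described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the function names, the default p = 3, the choice of non-overlapping windows (stride equal to p), the base-4 numbering, the reservation of code 0 for padding, and the mapping of T to U are all assumptions made here, since the abstract does not specify them.

```python
# Illustrative sketch only; names and conventions below are assumptions.
NUCLEOTIDES = "ACGU"
INDEX = {nt: i for i, nt in enumerate(NUCLEOTIDES)}


def normalize(seq):
    """Upper-case the sequence and treat DNA-style T as U (assumption)."""
    return seq.upper().replace("T", "U")


def p_nts_encode(seq, p=3, stride=None):
    """Map each window of p nucleotides to one integer code.

    Assumptions: windows do not overlap unless a stride is given, an
    incomplete trailing window is dropped, and codes start at 1 so that
    0 stays free as a padding value for fixed-length LSTM input.
    """
    seq = normalize(seq)
    stride = p if stride is None else stride
    codes = []
    for start in range(0, len(seq) - p + 1, stride):
        code = 0
        for nt in seq[start:start + p]:
            code = code * 4 + INDEX[nt]  # read the window as a base-4 number
        codes.append(code + 1)
    return codes


def one_hot_encode(seq):
    """Return a 4 x n matrix (list of rows) with one row per nucleotide."""
    seq = normalize(seq)
    matrix = [[0] * len(seq) for _ in NUCLEOTIDES]
    for col, nt in enumerate(seq):
        matrix[INDEX[nt]][col] = 1
    return matrix


if __name__ == "__main__":
    print(p_nts_encode("ACGUAC", p=3))   # [7, 50]
    print(one_hot_encode("ACGU"))        # 4 x 4 identity matrix
```

In this sketch an ambiguous base such as N raises a KeyError; a real pipeline would need an explicit policy for such characters, which the abstract does not describe.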
Keywords/Search Tags: Deep learning, lncRNA, recognition, plant, functional prediction