| Circular RNAs (circRNAs) bind to RNA-binding proteins (RBPs) and thereby play an important role in many biological processes. It is therefore crucial to study the binding sites of RBPs on circRNAs. Although traditional machine learning and deep learning methods have been used to predict interactions between circRNAs and RBPs, existing algorithms neither fully learn the relevant features of circRNAs nor perform efficient collaborative learning. To address these issues, this paper proposes two methods for identifying circRNA-RBP binding sites. The main work is as follows. 1) The first work investigates DMSK, a multi-view classification method for identifying circRNA-RBP interaction sites that is based on multi-view deep learning, subspace learning and a multi-view classifier. Existing prediction methods focus mainly on the circRNA sequence itself, and neither the structural nor the compositional information of circRNA sequences has been fully exploited. Some methods have extracted different views to construct recognition models, but how to use multi-view data efficiently for recognition model construction has not been thoroughly investigated. In DMSK, the circRNA sequences are first converted into pseudo-amino acid sequences and pseudo-dipeptide components, which are used to extract the high-dimensional sequence features and the component features of circRNAs, respectively. Then, the structure prediction method RNAfold is used to predict the secondary structure of the RNA sequences, and a sequence embedding model is used to extract context-dependent features. Next, the initial feature data of the four views are passed to a hybrid network consisting of a convolutional neural network (CNN) and a long short-term memory network (LSTM) to obtain the multi-view deep features of circRNAs. Further, the common features of the four views are extracted by subspace learning based on view-weighted Generalized Canonical Correlation Analysis (WGCCA). Finally, the learned common subspace features and the multi-view deep features are used to train a downstream multi-view TSK fuzzy system classifier, yielding a rule-based multi-view classifier with better interpretability. With the trained classifier, the specific positions of RBP binding sites on circRNAs can be predicted for new samples. Our experimental study shows that the prediction performance of DMSK is greatly improved compared with existing methods. 2) The second work investigates a new method, LGMK, which integrates local and global feature learning. For the local features, new pseudo-amino acid sequences, combinations of RNA sequences and their secondary structures, k-nucleotide frequencies (k-mer frequencies) and CircRNA2Vec features are used to extract amino acid sequence features, structural features, contextual features and continuous distributed semantic features, respectively. In addition, global features are obtained by combining the local features of each view. After the initial features of the local and global views have been extracted, the initial features of each view are learned using a hybrid deep network containing two core modules: a deep multiscale residual network (DMSRN) and BiGRUs with a self-attention mechanism. The deep features corresponding to each initial view are extracted, and the local and global deep features are then constructed from them. The resulting multi-view deep features serve as multi-view training data for the downstream classification model. Based on these data, a rule-based multi-view TSK fuzzy system classifier with better transparency and interpretability is introduced to construct a classification module for circRNA-RBP binding sites. Our experimental study shows that LGMK outperforms the compared methods by making full use of local and global features. |
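
As an illustration of one of the local views named above, the k-nucleotide (k-mer) frequencies, the following is a minimal sketch rather than the authors' implementation. The function name `kmer_frequencies`, the `ACGU` alphabet, the use of overlapping windows and the normalization by the number of valid windows are all assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter
from itertools import product

# Assumed RNA alphabet; DNA-style input is mapped onto it (T -> U).
ALPHABET = "ACGU"


def kmer_frequencies(seq: str, k: int = 3) -> dict[str, float]:
    """Return the relative frequency of every possible k-mer in seq.

    Hypothetical helper: windows overlap (stride 1), and windows that
    contain a character outside ACGU (for example N) are ignored.
    """
    seq = seq.upper().replace("T", "U")
    # All 4**k possible k-mers, in a fixed order, so that every sequence
    # yields a feature vector of the same length.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalize by the number of valid windows only.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


if __name__ == "__main__":
    freqs = kmer_frequencies("ACGUACGU", k=2)
    print(len(freqs), freqs["AC"])
```

For a fixed k the result always has 4^k entries, so sequence fragments of different lengths map to vectors of the same size, which is what a downstream network or classifier would typically require.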