| N6-methyladenosine (m6A) is one of the most abundant RNA methylation modifications and is involved in many biological processes, including brain developmental malformations, human obesity, protein translation, RNA splicing, RNA degradation, and RNA stability. Current research indicates that high-throughput sequencing or wet-lab experiments can be used to detect m6A peaks. However, these approaches cannot accurately locate the m6A methylation site. With the advent of the artificial intelligence era, researchers began to explore the use of machine learning to identify m6A methylation sites. This paper proposes and constructs two predictors for RNA sequence-based m6A methylation site recognition, each taking a different direction: m6A methylation site recognition using ensemble learning, and m6A methylation site recognition based on multi-species datasets. In the study of m6A methylation site recognition using ensemble learning, a predictor called M6APred-EL was proposed. This paper modified and adopted feature representation methods covering three kinds of information: physicochemical information, chemical structure information, and position-specific information. The three feature extraction methods are based, respectively, on global position-specific polynucleotide propensity, on chemical properties combined with local position-specific information, and on a table of dinucleotide physicochemical properties. A comparative experiment was also performed on the polynucleotide size used in global position-specific polynucleotide propensity. It showed that performance is best when the number of nucleotides per polynucleotide equals 1. Therefore, global position-specific polynucleotide propensity with this setting is adopted as the feature extraction method for one of the three base classifiers. The M6APred-EL predictor is based on majority voting among the three base classifiers (each corresponding to one of the three feature extraction methods mentioned above). The 
results show that, compared with the latest research results, the proposed method improves accuracy by 1.18% and the Matthews correlation coefficient by 0.03. In the m6A methylation site recognition study based on multi-species RNA sequences, we proposed and constructed a sequence-based predictor called M6AMRFS. Feature extraction is performed by means of dinucleotide binary code and dinucleotide cumulative position frequency statistics. F-score is then used to select features, and the selected features are fed into Extreme Gradient Boosting (XGBoost) to obtain the final classifier model. A variety of comparative experiments, including classifier comparisons, comparisons with and without feature selection, comparisons with the latest feature extraction algorithms, and comparisons with the latest predictors, show that the M6AMRFS predictor performs well on multiple datasets. In general, M6AMRFS is superior or equivalent to state-of-the-art methods. |
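
The two selection and combination steps named in the abstract, majority voting over three base classifiers and F-score feature ranking, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `majority_vote` and `f_score` are hypothetical, and the F-score formula used here (between-class separation of the feature means divided by the sum of within-class sample variances) is an assumption about which definition the paper applies.

```python
from collections import Counter
from statistics import mean, variance


def majority_vote(predictions):
    """Combine label lists from several base classifiers by majority vote.

    `predictions` holds one equal-length list of labels per base classifier
    (hypothetical input layout). With three classifiers and binary labels
    a tie cannot occur.
    """
    combined = []
    for labels in zip(*predictions):
        # Pick the label predicted by the largest number of classifiers.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def f_score(pos_values, neg_values):
    """F-score of a single feature (assumed definition, see lead-in).

    `pos_values` and `neg_values` are lists with the feature's values on
    positive and negative samples; each needs at least two entries.
    """
    overall = mean(pos_values + neg_values)
    between = (mean(pos_values) - overall) ** 2 + (mean(neg_values) - overall) ** 2
    within = variance(pos_values) + variance(neg_values)  # sample variances
    # A feature that is constant within both classes gets score 0 here.
    return between / within if within > 0 else 0.0


# Three base classifiers voting on three candidate sites (1 = m6A site).
votes = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])

# A feature whose values differ clearly between the two classes.
score = f_score([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Under this definition, features would be ranked by descending score and the top-ranked ones passed on to the classifier; the cut-off used for M6AMRFS is not stated in the abstract.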