Functional Classification Of Antimicrobial Peptides And Prediction Of Promoter Methylation Sites Based On Ensemble Deep Learning

Posted on: 2022-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Shao
GTID: 2480306611957919
Subject: Automation Technology
Abstract/Summary:
Bioinformatics is the science of using computers to store, retrieve, and analyze biological information in the study of the life sciences. Among the computational approaches used in bioinformatics, ensemble learning and deep learning have each had a substantial impact, with applications ranging from basic nucleotide and protein sequence analysis to systems biology. Recently, the rapidly growing synergy between these two technologies has driven the development and application of a new generation of machine learning methods known as ensemble deep learning. Ensemble deep learning models have brought new ideas, algorithms, and frameworks that greatly enrich the earlier paradigm. Antimicrobial peptides are small polypeptides with immune activity that organisms produce naturally. They can kill bacteria, fungi, viruses, and even cancer cells, and do not cause antibiotic resistance. They are also potential drugs for treating novel coronavirus infection. DNA methylation in human promoter regions is closely related to cancer and has great clinical potential in cancer risk assessment, early diagnosis, treatment planning, and prognosis monitoring. This thesis therefore selects antimicrobial peptide sequences and human promoter methylation sites as application targets for ensemble deep learning, and makes the following contributions.

For antimicrobial peptide identification and function prediction, this thesis integrates existing antimicrobial peptide databases to build a newer and larger benchmark dataset. By combining a "CNN-BiLSTM-SVM" ensemble deep learning model with cellular automata images of proteins, a novel predictor, iAMP-CA2L, was developed. iAMP-CA2L is a two-stage predictor: the first stage determines whether a given peptide is an antimicrobial peptide, and the second stage is a multi-label predictor that predicts its functional types. Leave-one-out validation shows that, compared with existing predictors, iAMP-CA2L has a higher recognition rate and covers a wider range of functions, achieving 94.13% accuracy in the first stage and 55.85% absolute accuracy in the second stage. The iAMP-CA2L online predictor is freely accessible at http://121.36.221.79/iAMP-CA2L, and the standalone program is available at https://github.com/liujin66/iAMP-CA2L.

For the identification and prediction of human promoter methylation sites, three datasets were constructed from the Cancer Cell Line Encyclopedia database, covering small cell lung cancer, non-small cell lung cancer, and hepatocellular carcinoma, with a total of 3 million sample sequences. Sample sequences are encoded using frequency-based independent one-hot encoding, and the m5C-HPromoter predictor is built as a stacked ensemble deep learning model that integrates DNN, SVM, XGBoost, and LightGBM. Numerical experiments show that, compared with existing predictors, m5C-HPromoter is more powerful and stable, achieving an accuracy of 92.7±0.22% in identifying human promoter methylation sites. The m5C-HPromoter online predictor is freely accessible at http://121.36.221.79/m5C-HPromoter, and the standalone program is available at https://github.com/liujin66/m5C-HPromoter.
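The "absolute accuracy" reported for the second-stage multi-label predictor is a strict metric, which helps explain why it is much lower than the first-stage accuracy. The sketch below assumes the usual exact-match definition (a sample counts as correct only if its predicted label set equals the true label set exactly); the abstract does not spell out the formula. The function name `absolute_accuracy` and the example labels are hypothetical and are not taken from the iAMP-CA2L code or data.

```python
from typing import Sequence, Set


def absolute_accuracy(true_labels: Sequence[Set[str]],
                      predicted_labels: Sequence[Set[str]]) -> float:
    """Fraction of samples whose predicted label set matches the true set exactly.

    Assumption: this is the common exact-match (subset accuracy) definition;
    the thesis abstract does not state which formula it uses.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted label lists must have equal length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    exact_matches = sum(
        1 for truth, pred in zip(true_labels, predicted_labels)
        if set(truth) == set(pred)
    )
    return exact_matches / len(true_labels)


# Hypothetical functional labels for four peptides (illustration only).
y_true = [
    {"antibacterial"},
    {"antibacterial", "antifungal"},
    {"antiviral"},
    {"anticancer", "antibacterial"},
]
y_pred = [
    {"antibacterial"},
    {"antibacterial"},                # partially right, still counted as a miss
    {"antiviral"},
    {"antibacterial", "anticancer"},  # same set, order does not matter
]

score = absolute_accuracy(y_true, y_pred)
print(score)  # 0.75
```

Under this definition the second peptide scores zero even though one of its two functions was predicted correctly, so exact-match scores for multi-label problems tend to sit well below per-label or binary accuracy.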
Keywords: ensemble deep learning, antimicrobial peptides, cellular automata, promoter methylation, 5-methylcytosine