Mining Probiotic Genome Molecular Markers And Constructing A Visual Screening Prediction Platform Based On Machine Learning

Posted on: 2022-08-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y Sun
GTID: 2480306509454234
Subject: Bio-engineering
Abstract/Summary:
Probiotics are widely favored by consumers because of their beneficial effects on the host, including regulation of immune function and the gastrointestinal microbiota. Although some probiotic strains have been used extensively, the screening of potential probiotic strains is still not accurate enough. In the traditional development process, complicated experimental verification requires a long experimental period. It is therefore essential to use bioinformatics methods and big data technology to screen effective probiotics quickly and to establish a visual prediction platform. In this study, machine learning algorithms are applied for the first time to mining molecular markers of probiotics and to constructing a predictive screening platform. We selected 239 probiotic strains and 412 non-probiotic strains and characterized each by the K-mer (2- to 8-mer) markers of its whole-genome sequence. Using a feature selection procedure that combines the F-score with incremental feature selection (IFS), we screened 184 core features from the genomes of all strains. A support vector machine (SVM) was then used to construct a probiotic prediction model. The prediction accuracy reached 97.7% in 10-fold cross-validation, indicating that the core K-mer features can serve as potential molecular screening markers for probiotic genomes. Furthermore, functional annotation of the genes related to the core features indicates that the probiotic effect is determined not by a single gene but by a systematic, networked synergy among genomic molecules; in total, 60 core functional roles shared by the 239 probiotic strains were identified. We then verified the credibility of the prediction model against known probiotic characteristics: intestinal colonization, carbohydrate metabolism, drug resistance, and virulence factors. Subsequently, by extracting 53 K-mers from the core features, we constructed a classification model that assigns genomes to Lactobacillus, Bifidobacterium, and other probiotics. The model was evaluated on an independent test set, and the results were consistent with those of the phylogenetic analysis, which further supports that the core K-mers have stable predictive performance in the accurate classification and screening of probiotic genomes. Finally, we established iProbiotics, an online visual screening and prediction platform for probiotics that includes whole-genome sequences, molecular markers and related genes, BLAST search, and an online prediction model, to provide professional bioinformatics services and a reference for researchers engaged in probiotics research.
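
The feature pipeline described in the abstract (K-mer characterization, F-score ranking, incremental feature selection) can be illustrated with a minimal Python sketch. This is not the thesis code. It assumes overlapping K-mers over the ACGT alphabet normalized to relative frequencies, and the commonly used two-class F-score with sample variances; the abstract does not state either detail. The function names and the `evaluate` callback are hypothetical. In the thesis setting, `evaluate` would presumably return the 10-fold cross-validated SVM accuracy for a given feature subset.

```python
from itertools import product
from statistics import mean, variance


def kmer_frequencies(seq, k):
    """Relative frequency of each overlapping k-mer, in lexicographic ACGT order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # windows containing N or other symbols are skipped
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def f_score(pos, neg):
    """Assumed F-score form: between-class spread over summed within-class variance.

    Each class needs at least two values, because sample variance is used.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(X_pos, X_neg):
    """Feature indices sorted by descending F-score (rows are genomes)."""
    n_features = len(X_pos[0])
    scores = [
        f_score([row[j] for row in X_pos], [row[j] for row in X_neg])
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def incremental_feature_selection(ranked, evaluate):
    """Add ranked features one at a time and keep the best-scoring prefix.

    `evaluate` is a hypothetical callback mapping a list of feature indices
    to a score, for example a cross-validated accuracy.
    """
    best_score, best_subset = float("-inf"), []
    for n in range(1, len(ranked) + 1):
        score = evaluate(ranked[:n])
        if score > best_score:
            best_score, best_subset = score, ranked[:n]
    return best_subset, best_score
```

For the 2- to 8-mer range mentioned above, the vectors returned by `kmer_frequencies(genome, k)` for k = 2 through 8 would be concatenated into one feature vector per genome before ranking.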
Keywords/Search Tags:probiotics, machine learning, support vector machine, K-mer, functional genomics, incremental feature selection