| The development of high-throughput sequencing has rapidly advanced the study of the animal gut microbiome. The explosive growth of sequencing data calls for more efficient analysis methods, and machine learning has unique advantages for microbiome analysis and modeling. Although the Qinghai-Tibetan Plateau has a harsh climate and is sparsely populated, its special environment also gives rise to characteristic biodiversity. Exploring the animal gut microbiome can therefore further promote the protection of animal biodiversity in the Qinghai-Tibetan Plateau. Non-invasive, non-destructive sampling was used here. Then, based on 16S rRNA high-throughput sequencing, this study conducted data mining on the structure and diversity of fecal microbiota using computational methods such as feature extraction, random forest, logistic regression, and sparse logistic regression. Different machine learning models were trained and compared, and significantly different biomarkers (feature genera) were then extracted. The research contents of this study can be summarized as follows: (1) Given the controversy over whether mammalian fecal samples can replace intestinal contents for exploring the microbiome, data analysis and mining were carried out. Small intestine (jejunum) contents, large intestine (cecum and colon) contents, and feces from 6 sheep of similar weight and age were collected and subjected to 16S rRNA sequencing. The results showed that non-invasive fecal samples could be used to predict the large intestinal microbiota, although their composition was not identical to it. Sampling feces rather than large intestinal contents is thus an effective, low-cost method for characterizing the microbiota of the large intestine. In addition, by recombining the sample data and training and evaluating classifiers such as K-nearest neighbors, random forest, support vector machine, logistic regression, and sparse logistic regression, it was determined that the random forest model is more suitable for small-sample, imbalanced, binary classification
microbiota data. (2) The characteristics of the ruminant intestinal microbiota in the Qinghai-Tibetan Plateau were studied using traditional machine learning and deep learning methods. A total of 340 fecal samples from naturally grazing yaks, cattle, yak-cattle hybrids, and Tibetan sheep from 4 ecological regions of the Qinghai-Tibetan Plateau were collected for 16S rRNA sequencing data analysis and biomarker screening. Based on clustering, dimensionality reduction, and Spearman analysis, the intestinal microbiota of ruminants in the Qinghai-Tibetan Plateau were divided into 2 enterotypes, the Ruminococcaceae UCG-005 enterotype and the Acinetobacter enterotype, which could be conducive to potential disease prediction and dietary analysis. This is the first report of the Acinetobacter enterotype in studies of animal gut microbiota and of yak enterotypes in the Qinghai-Tibetan Plateau. Moreover, by assessing the relationship between fecal microbiota and the above variables, a scattered pattern of fecal microbiota dissimilarity was identified that was driven by environment more than by the other variables. Furthermore, by training and comparing models such as deep neural networks, Transformer, K-nearest neighbors, support vector machine, logistic regression, sparse logistic regression, and random forest, it was determined that the random forest model had the best predictive performance, followed by the Transformer model. Additionally, through feature importance analysis, several influential biomarkers, such as Lysinibacillus, were identified in yaks and Tibetan sheep. (3) Using 16S rRNA sequencing, a data mining study was carried out on changes in horse fecal microbial community structure and diversity associated with diarrhea, and biomarkers (feature genera) were screened with a random forest model from a microbial perspective. A total of 16 fecal samples from diarrheic and healthy Qaidam horses in the Qinghai-Tibetan Plateau were collected, and analysis of the fecal microbiota of these samples indicated that the fecal microbial community structure
and diversity differed between diarrheic and healthy Qaidam horses. In addition, 4 biomarkers (feature genera) were screened by the random forest model: Methanobrevibacter, Fibrobacter, Carnobacterium, and Elusimicrobium, which have the potential to become therapeutic targets for diarrhea in Qaidam horses. (4) The relationship between 3 zoonotic parasites and the structure, composition, and diversity of the fecal microbiota was studied in grazing Bactrian camels in the Qinghai-Tibetan Plateau. The 16S rRNA sequencing data of Bactrian camel fecal samples (n=38) infected with the 3 zoonotic parasites were analyzed, and biomarkers were extracted by linear discriminant analysis effect size (LEfSe). The results indicated that infection with the 3 parasites did not significantly affect the fecal microbiota diversity of Bactrian camels, but had a great impact on the relative abundance of some phyla and genera. Compared with samples not infected with parasites, the following feature genera with significant differences were screened: Ruminococcus and Akkermansia (samples infected with Enterocytozoon bieneusi); Oscillospira, Prevotella, and Akkermansia (samples infected with Giardia duodenalis); and Dorea (samples co-infected with Cryptosporidium spp. and Enterocytozoon bieneusi). In summary, this study not only established the feasibility of non-invasive sampling (fecal samples) for gut microbiome research but also trained various machine learning models on datasets with different biological groupings. The random forest model had the best predictive performance in this study. Moreover, potential biomarkers (feature genera) were identified, which could become new targets for the diagnosis, prevention, and treatment of diarrhea and parasitic infections in plateau animals. In addition, this study provided a fecal microbiota profile of ruminants living in 4 ecoregions of the Qinghai-Tibetan Plateau. |
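
The abstract names Spearman analysis as one step in part (2) but does not say how it was applied. As a minimal illustration only, the sketch below computes Spearman's rank correlation between the relative abundances of two genera across samples, using the Python standard library. The function names `average_ranks` and `spearman` and the abundance values are hypothetical and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical relative abundances of two genera in six fecal samples
# (illustrative numbers only, not data from the study).
genus_a = [0.12, 0.30, 0.05, 0.22, 0.18, 0.40]
genus_b = [0.40, 0.10, 0.45, 0.20, 0.25, 0.02]

rho = spearman(genus_a, genus_b)
print(f"Spearman rho = {rho:.2f}")
```

Because the correlation is computed on ranks rather than raw values, it is insensitive to the skewed distributions typical of relative-abundance data. A strongly negative value, as in this toy input, would suggest that the two genera tend to dominate different samples.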