With extensive research on microbial communities, the influence of the structure of the microbial community in the human digestive tract on human health has received widespread attention. However, most existing studies focus on the relationship between a single type of disease and the intestinal flora, and a systematic collation is lacking. It is therefore of great significance to develop a predictive analysis system that can mine and sort out the relationships between the structure of the human intestinal flora and diseases. The existing microbiome sequencing analysis workflow is relatively cumbersome, and the accuracy of some key steps in the sequencing analysis process needs to be improved. At the same time, the number of publications on the structure of the intestinal microflora, common diseases, and related topics is growing explosively, and some of the disease associations they contain have not been systematically sorted out and discovered. This paper studies the relationship between intestinal flora and diseases from several perspectives. The specific research content is as follows.

First, to provide reliable input data for the fully connected deep neural network (DNN) study, microbial metagenomic sequencing data and 16S amplicon sequencing data are analyzed and processed. To this end, a pathogen sequencing data analysis pipeline is developed that simplifies the analysis process: the pipeline can be run through a simple interactive interface without repeatedly entering Linux commands. In addition, a 16S amplicon sequencing data processing pipeline is implemented on top of the Usearch framework, which is relatively mature at this stage. The reads.fq file of the raw sequencing data is processed step by step to obtain an OTU table that describes species abundance and serves as input data for the subsequent research.

Second, because some disease associations and rules hidden between the intestinal flora and diseases have not been systematically sorted out, a knowledge graph built from a large body of medical literature is proposed. Its construction includes crawling the literature and then processing the text (word segmentation, sentence splitting, part-of-speech tagging, lemmatization, and removal of stop words and special characters) to achieve knowledge extraction. Five key types of biomedical entities are then identified: genes, diseases, drugs, bacteria, and disease phenotypes. Relationships are derived from the co-occurrence of two entities, and the entities are stored in the Neo4j graph database using a property-graph representation to form a complete medical knowledge graph. The hidden knowledge is aggregated to establish a human intestinal flora and disease analysis database, which makes it convenient to query the relationships between diseases and flora.

Finally, a fully connected DNN is used to predict the complex relationship between flora structure and disease, and multiple taxonomic levels of the flora are analyzed together to ensure the accuracy and completeness of the prediction. The experiments use 10-fold cross-validation to compare and evaluate the prediction results of three methods, among them DNN and CNN, on six data sets. The results confirm that DNN achieves better disease prediction from species abundance, which demonstrates the superiority of this method. Based on the above research, this paper designs and implements a system for predicting the relationship between human intestinal flora structure and disease. Step-by-step tests verify the effectiveness of the proposed method and show that it meets the design goals of this paper.
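
The co-occurrence step of the knowledge graph construction can be illustrated with a minimal sketch. It is not the implementation used in this paper: the entity dictionary, the sample text, and the function names (`split_sentences`, `entities_in`, `cooccurrence_edges`) are hypothetical, dictionary lookup stands in for real entity recognition, and storage in Neo4j is omitted. The sketch assumes that two entities are related whenever they appear in the same sentence, and it counts how often each pair does so.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy dictionary mapping entity names to entity types.
# A real system would use a trained recognizer or curated vocabularies.
ENTITY_TYPES = {
    "bacteroides": "bacteria",
    "lactobacillus": "bacteria",
    "obesity": "disease",
    "colitis": "disease",
    "metformin": "drug",
}


def split_sentences(text):
    """Naive sentence splitter: break on whitespace after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def entities_in(sentence):
    """Return the sorted, de-duplicated known entities found in a sentence."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return sorted({t for t in tokens if t in ENTITY_TYPES})


def cooccurrence_edges(text):
    """Count, for each entity pair, the sentences in which both appear."""
    edges = Counter()
    for sentence in split_sentences(text):
        for a, b in combinations(entities_in(sentence), 2):
            edges[(a, b)] += 1
    return edges


if __name__ == "__main__":
    sample = (
        "Bacteroides is reduced in obesity. "
        "Metformin alters Bacteroides and Lactobacillus in obesity. "
        "Colitis was not studied."
    )
    for (a, b), n in sorted(cooccurrence_edges(sample).items()):
        print(f"{a} ({ENTITY_TYPES[a]}) -- {b} ({ENTITY_TYPES[b]}): {n}")
```

Each counted pair corresponds to a candidate edge between two entity nodes in a property graph, with the count available as an edge attribute such as a weight.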