| Microbiome research has recently become an important approach to studying human health and the ecological environment. Microbial communities are closely related to human health, and many researchers have begun to focus on the human microbial community. With the development of second- and third-generation sequencing technologies, the volume of microbiome data is growing geometrically. The stability and specificity of microorganisms is a significant question in the study of the human microbial community. Because a microbial community is specific to an individual and stable over time, its properties can be used to encode individual characteristics, which gives it potential value in identity verification and forensic identification. First, the microbial features of each sample are extracted from the microbial community data. Then, following the idea of the minimal hitting set, a minimal hitting set for each individual is computed with a greedy strategy; this set is called the individual's metagenomic code. The metagenomic codes are then compared with the abundance data from a second sampling: if an individual's code is present in a sampled microbial community, the individual can be matched correctly. To improve the correct matching rate of metagenomic coding, the main contributions of this work are as follows. First, this paper proposes a metagenomic encoding algorithm based on relative abundance values. Because the main features of a microbial community are related to feature abundance, features should not be extracted only according to differences in abundance between microorganisms; the abundance values themselves, and their effect on the stability of the features, should also be taken into account. This paper therefore proposes a method that identifies individuals based on abundance values.
To verify the validity of the metagenomic encoding algorithm based on relative abundance values, this paper compares it with the algorithm based on abundance difference on 4 data sets. The experimental results show that the algorithm based on relative abundance values achieves a better correct matching ratio on the Markers and Kbwindows data sets, and that 9 of the results on the OTUs data sets are better. Second, this paper proposes a joint metagenomic code identification algorithm based on both abundance and abundance difference. It combines the advantages of the two algorithms and uses heuristic parameter control so that the algorithm achieves the best correct matching rate. Experimental results on the four types of data show that the joint algorithm yields higher correct matching results in individual identification, improving the correct matching rate of metagenomic codes for recognizing individuals. |
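
The greedy hitting-set step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a feature counts as "present" when its relative abundance reaches a threshold, and the names `present_features`, `greedy_code`, `match`, and the `threshold` parameter are all hypothetical. A code for an individual is a set of that individual's features that no other individual in the first sampling contains in full; the greedy choice yields a small code, not necessarily the smallest possible one.

```python
from typing import Dict, List, Optional, Set

# individual -> feature -> relative abundance (illustrative data layout, an assumption)
Abundance = Dict[str, Dict[str, float]]


def present_features(abundance: Abundance, threshold: float) -> Dict[str, Set[str]]:
    """Keep the features whose relative abundance is at least `threshold` (assumed rule)."""
    return {
        name: {f for f, value in feats.items() if value >= threshold}
        for name, feats in abundance.items()
    }


def greedy_code(target: str, features: Dict[str, Set[str]]) -> Optional[Set[str]]:
    """Greedily build a hitting set that separates `target` from every other individual."""
    own = features[target]
    # For each other individual, the target's features that the other one lacks.
    unhit = [own - other for name, other in features.items() if name != target]
    if any(not diff for diff in unhit):
        return None  # some individual has all of the target's features: no code exists
    code: Set[str] = set()
    while unhit:
        # Pick the feature that hits the most remaining difference sets;
        # sorting makes ties resolve to the alphabetically first feature.
        best = max(sorted(own - code), key=lambda f: sum(f in d for d in unhit))
        code.add(best)
        unhit = [d for d in unhit if best not in d]
    return code


def match(code: Set[str], sample_features: Dict[str, Set[str]]) -> List[str]:
    """Return the individuals in a later sampling whose features contain the whole code."""
    return [name for name, feats in sample_features.items() if code <= feats]


if __name__ == "__main__":
    first = present_features(
        {"A": {"f1": 0.5, "f2": 0.3, "f3": 0.2}, "B": {"f1": 0.6, "f2": 0.4}}, 0.1
    )
    second = present_features(
        {"A": {"f1": 0.4, "f3": 0.6}, "B": {"f1": 0.5, "f2": 0.5}}, 0.1
    )
    code_a = greedy_code("A", first)
    print(sorted(code_a), match(code_a, second))
```

A match is counted as correct when the code built from the first sampling selects exactly the same individual in the second sampling. Raising the hypothetical `threshold` restricts codes to more abundant features, which is one simple way to reflect the abstract's point that abundance values matter for feature stability.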