| Syndrome is the core of traditional Chinese medicine (TCM) syndrome differentiation. It emphasizes viewing local problems of the human body from a holistic perspective and organically integrates physiological, psychological, and other factors. The pathogenesis of many diseases is multifaceted, which poses great difficulty for the existing Western medicine treatment system. The abstraction and complexity of syndromes make them well suited to revealing the causes of these diseases and proposing distinctive treatment options. At present, basic research on syndrome biology has achieved some results in animal modeling and in syndrome analysis of typical diseases. However, owing to the lack of syndrome-related data, especially syndrome-molecule association data, research on the molecular mechanisms of syndromes lags far behind. This paper first constructs a syndrome-related knowledge graph and conducts correlation analysis on it. Then, based on the knowledge graph, it carries out research on syndrome-gene relationship prediction and on the analysis of syndrome molecular mechanisms. The main work of this paper consists of the following three parts: (1) Construction and analysis of syndrome-related knowledge graphs. First, the Chinese and English syndrome-disease and syndrome-symptom data were processed, and CUI codes were matched to diseases and symptoms, yielding 1,787 standardized syndrome-disease records and 2,132 syndrome-symptom records. Fisher's test was used to obtain 105,719 syndrome-gene records. Consistency analysis and other methods were used to verify the quality of the syndrome-gene data. The results showed a significant positive correlation between the gene-based similarity between syndromes and the disease- or symptom-based similarity, indicating that the data quality is relatively reliable. Furthermore, the association network of syndromes based on disease similarity, the association network of syndromes based on symptom similarity, and the
association network of syndromes based on gene similarity were constructed and analyzed. (2) Methods for syndrome gene prediction. This paper proposes PSGene, a syndrome-gene relationship prediction framework that combines pre-training and knowledge completion. First, three link prediction methods are tested at different dimensions, and prediction performance is found to be best when the dimension is 200. Four pre-training methods are then used to train entity vectors, and each pre-training method is combined with each of the three link prediction methods. The test results show that PSGene_Line_ComplEx performs best, with an MRR 4.16% higher than that of ComplEx. Finally, a model fusion method is used to design three scoring functions. The experimental results show that the fused result is 16.67% higher than the best result before fusion. (3) Network-based analysis of the mechanisms of typical syndromes. Molecular networks were constructed for the three diseases corresponding to liver-kidney yin deficiency and for the three syndromes corresponding to bulging, and their core subnetworks were analyzed. Molecular network association analysis and pathway enrichment analysis were then carried out for different diseases under the same syndrome and for different syndromes of the same disease. In terms of genes, subnet distances, subnet edges, and enriched pathways, the analyses reveal a close relationship between different diseases under the same syndrome and between different syndromes of the same disease, which provides theoretical support for syndrome research. |
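
To make the evaluation terms in part (2) concrete, the following is a minimal sketch of the standard ComplEx scoring function and of mean reciprocal rank (MRR), the metric used above to compare PSGene_Line_ComplEx with ComplEx. It is not the paper's implementation: the function names, the toy vectors, and the tie-handling rule are illustrative assumptions, and the PSGene pre-training and fusion steps are not shown.

```python
from typing import List, Sequence


def complex_score(head: Sequence[complex],
                  rel: Sequence[complex],
                  tail: Sequence[complex]) -> float:
    """Standard ComplEx triple score: Re(sum_k head_k * rel_k * conj(tail_k))."""
    total = sum(h * r * t.conjugate() for h, r, t in zip(head, rel, tail))
    return total.real


def rank_of_true_tail(head: Sequence[complex],
                      rel: Sequence[complex],
                      candidates: List[Sequence[complex]],
                      true_index: int) -> int:
    """Rank of the true tail among all candidate tails (1 = best).

    Assumption: ties are resolved optimistically, i.e. only candidates with a
    strictly higher score are counted ahead of the true tail.
    """
    scores = [complex_score(head, rel, cand) for cand in candidates]
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """MRR: the average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    # Hypothetical 2-dimensional embeddings for one syndrome, one relation,
    # and three candidate genes; the values carry no biological meaning.
    syndrome = [1 + 1j, 0.5 - 0.2j]
    associated_with = [1 + 0j, 0.3 + 0.1j]
    genes = [[1 + 1j, 0.4 + 0j], [-1 + 0j, 0.1 + 0.9j], [0.2 - 1j, 1 + 0j]]
    rank = rank_of_true_tail(syndrome, associated_with, genes, true_index=0)
    print("rank of true gene:", rank)
    print("MRR over one query:", mean_reciprocal_rank([rank]))
```

Because MRR averages reciprocal ranks, it rewards a model mainly for placing the correct gene at or near the top of the candidate list; gains at lower ranks move the metric only slightly.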