| Objective: Drawing on the large body of fragmented infertility knowledge recorded in ancient Chinese medical texts and on the articles that record it, this work studies how to construct a knowledge graph on the subject of infertility and how to organize, mine, recommend and use the related articles. It aims to open a new path for the rational organization of infertility knowledge in ancient Chinese medical texts, the discovery of knowledge inheritance relationships, and the exploration of recommendation; to provide ideas for efficiently assisting the establishment of a knowledge system of infertility in these texts; and to promote the effective inheritance and sound development of the knowledge they contain. Methods: 1) Knowledge representation and knowledge modelling research: Using the large amount of infertility knowledge recorded in the ancient Chinese medical texts included in the Chinese Medical Canon as the knowledge source, we obtained and exported the articles containing infertility knowledge by searching as comprehensively as possible, following the article selection criteria and the article boundary determination specifications. We then analysed the knowledge representation characteristics of infertility and summarized and described the knowledge elements contained in the articles, such as concept types, attribute types, the semantic relationships between concepts and the knowledge representation structure they jointly constitute, to form a semantics-based knowledge representation model. To obtain a quantitative understanding of the relationships between concepts and between concepts and attributes, a semantic co-occurrence analysis of the knowledge elements is conducted, and the co-occurrence frequency is used as an aid to further clarify the core concepts and important attributes in each knowledge representation type and to reasonably
select the knowledge elements for modelling. The concepts, attributes and semantic relationships of each sub-topic are then designed to form sub-topic knowledge models, and the sub-topic models are integrated with the help of the ontology construction tool Protégé to build a knowledge model that reflects the knowledge system of infertility in ancient Chinese medicine, laying the foundation for the subsequent graph construction and mining research. 2) Research on the construction of the knowledge graph: For the articles related to infertility obtained from ancient Chinese medical texts, the required entities are annotated on the doccano text annotation platform. With the help of HanLP, a natural language processing toolkit, Python code is written to perform knowledge extraction and named entity recognition based on rules and custom dictionaries, and the relevant entities are obtained after manual proofreading. Under the constraints of knowledge normalization rules and with the help of national or industry standards, entity alignment is carried out by combining synonym string matching with manual verification, providing a more reliable knowledge source for knowledge graph construction. On this basis, the Labelled Property Graph (LPG) storage model, which is better suited to representing weighted relationships, is selected; the RDF triple model is converted according to a set of conversion rules, constraints on the value ranges of properties are added, and the relationships carrying properties are updated to form the property graph models of the five major sub-topics. With null values and array-type property values handled through batch import functions and Cypher statements, the knowledge graphs of the sub-topics and of the fused infertility topic are constructed and stored in Neo4j. We then conduct an exemplary study on the
discovery of knowledge inheritance on this basis to reveal the implied inheritance relationships between the articles. 3) Research on knowledge mining methods based on graph algorithms: To further explore methods for discovering articles that are highly important and noteworthy in terms of inheritance relationships, to understand the inheritance relationships among multiple articles under the same topic, and to trace the development of knowledge, graph algorithms are selected according to the graph characteristics of the knowledge graph: the PageRank algorithm is chosen to mine important articles, and the maximum spanning tree algorithm is chosen to explore the paths of inheritance among multiple articles. Using the graph algorithm support provided by the Graph Data Science Library (GDS) in Neo4j, we write the relevant Cypher statements, set the algorithm parameters, perform the calculations, and analyse the results in combination with professional knowledge to judge the feasibility of the methods comprehensively. 4) Recommendation exploration of the knowledge graph: Based on the infertility knowledge graph constructed above, a PageRank score property is added to the article nodes, and a weighted relationship between articles marking the path of the maximum spanning tree is added, in order to explore the recommendation of core concepts, important articles, and articles closely related to a specified article. Results: 1) Knowledge representation and knowledge modelling research: Through a comprehensive search of the Chinese Medical Canon, 2623 valid records were obtained and exported, and the infertility-related articles were divided into five major sub-topics: traditional Chinese medicine, prescriptions, acupuncture, diagnosis, and medical theory. Based on the analysis of knowledge elements
and knowledge structure, it was found that the knowledge elements contained in the sub-topics of traditional Chinese medicine, prescription and acupuncture were relatively consistent and that most of their knowledge structures were similar. In the diagnosis sub-topic, the distribution and structure of some knowledge elements were similar, and according to their characteristics the articles were divided into the Qikou Jiudao pulse category, the general pulse diagnosis category, and the inspection diagnosis category. The medical theory sub-topic is characterized by freely expressed articles that involve a wide variety of concepts and a complex and varied knowledge structure with little regularity. There are also many cases of cross-referencing between knowledge within the same topic type. Further, the co-occurrence analysis between concepts and attributes showed that the important attributes of the traditional Chinese medicine sub-topic were taste, medicinal properties and toxicity; those of the prescription sub-topic were composition, usage and precautions; those of the acupuncture sub-topic were acupoint location and acupuncture method; and those of the diagnosis and medical theory sub-topics were the various attributes designed for them. The concepts and attributes incorporated into the sub-topic knowledge models were identified through qualitative and quantitative analysis, and the corresponding inter-concept relationships were designed according to their semantics. Finally, knowledge models were built for the five major sub-topics and for the information on ancient texts, articles and authors, and on this basis an integrated knowledge model was established containing 11 related concepts, 17 related attributes and 12 inter-concept relationships. 2) Research on the construction of the knowledge graph: Through entity extraction based on rules and custom dictionaries, a total of 10,527
required non-normalized entities were obtained. For entity alignment, the etiologies and pathogeneses were summarized and classified, the dynasties were partially merged and ordered, and the extracted non-normalized entities were aligned under the constraints of the relevant standards and normalization rules, finally yielding 697 entities with normalized representations. While processing the articles, it was found that later physicians often quoted their predecessors' works directly or indirectly in their own writing. The text fragments formed by the quoted parts preserved the predecessors' understanding, so that later physicians inherited it; such inheritance is common and has produced a vast and repetitive body of knowledge. Studying this inheritance and tracing its history would help to grasp the development of knowledge quickly, which in turn would help to establish the relevant knowledge system and promote the inheritance of ancient TCM knowledge. We therefore propose a definition of "knowledge inheritance", a discovery process, and a related metric, the Knowledge Inheritance Degree (KID), and explain how it is calculated. The KID is calculated and stored in the knowledge graph as the weight of the relationship between articles. The knowledge model of each sub-topic is transformed into a property graph model that meets the storage requirements, with the property types and the ranges of property values constrained. Indexes on concepts are designed, and the relationships between articles, weighted by the knowledge inheritance degree, and the relationships between prescriptions, carrying prescription similarity as a property, are updated. The sub-topic knowledge graphs are constructed by batch import: the sub-topic knowledge graph of traditional Chinese medicine has 670 nodes of 6 categories and 1389 relations of 5
categories; the sub-topic knowledge graph of prescription has 3100 nodes of 7 categories and 17,566 relations of 7 categories; the sub-topic knowledge graph of acupuncture has 626 nodes of 7 categories and 1497 relations of 6 categories; the sub-topic knowledge graph of diagnosis has 269 nodes of 8 categories and 511 relations of 7 categories; the sub-topic knowledge graph of medical theory has 578 nodes of 6 categories and 1211 relations of 6 categories; and the fused infertility knowledge graph has 4889 nodes of 11 categories and 21,987 relations of 13 categories. An exemplary study of knowledge inheritance discovery conducted on the constructed sub-topic knowledge graphs found that the knowledge inheritance degree works well for evaluating the inheritance of knowledge among ancient texts of different eras, helping scholars to quickly understand the context of knowledge development and providing strong support for establishing the relevant knowledge system. 3) Research on knowledge mining methods based on graph algorithms: Taking the knowledge graph of traditional Chinese medicine as an example, the PageRank algorithm provided by the Graph Data Science Library (GDS) of Neo4j was invoked through Cypher statements to rank the importance of the articles under the topic. We analysed the content of the top-ranked articles together with their inheritance relationships in the knowledge graph and found that the PageRank algorithm can be used to mine the important articles within a limited topic. In the knowledge recommendation path mining, we select the top-ranked important articles that have only out-degree and no in-degree as the path starting points, use the knowledge inheritance degree as the edge weight in the path, use the Prim
algorithm for the maximum spanning tree to mine the paths, obtain the corresponding maximum spanning tree or maximum spanning forest, analyse it professionally together with the content of the articles, and confirm that the method can achieve effective recommendation and make learning more convenient for scholars. 4) Recommendation exploration of the knowledge graph: Based on the mined knowledge graph, a list of important core concepts is recommended for the sub-topic specified by the user; important articles are recommended in order of importance when the user specifies a core concept of interest; and, given an article of interest, the highest-ranked article is recommended according to a comprehensive evaluation of knowledge inheritance and importance. Conclusion: Constructing a knowledge graph of infertility in ancient Chinese medical books can organize and integrate the fragmented knowledge and its sources (paragraphs) well and help establish a more comprehensive and complete knowledge system; discovering the inheritance relationships among paragraphs, and mining key paragraphs and knowledge inheritance paths on that basis, provides new ideas and methods for recommending knowledge with focus and continuity. 1) The knowledge of infertility in ancient Chinese medical books is scattered. Obtaining the core concepts, attributes and relationships by moving from qualitative understanding to quantitative analysis on the basis of topic classification is the foundation of knowledge modelling, and knowledge modelling is a necessary step in constructing a domain knowledge graph that can significantly improve the graph's quality and interpretability. The advantage of the knowledge graph lies in its support for complex relationships and its high scalability, so it is an appropriate choice for organizing the knowledge of infertility in ancient Chinese medicine, which also makes the integration from multiple
sub-topics to the general topic more convenient and faster. 2) Most domain knowledge graphs focus on the knowledge obtained after organization and standardization and often neglect to make good use of the sources of that knowledge, or only display them as nodes in retrieval and visualization. These sources are more than sources, however: the inheritance relationships implied between the articles contain rich knowledge. Mining this inheritance helps to form a coherent understanding of the domain knowledge and to understand its context, so that the key knowledge of infertility-related topics can be grasped more quickly. 3) A knowledge graph is in essence a graph, so it can be analysed and mined with suitable graph algorithms, and mining implied correlations with graph algorithms can help discover the key articles and core knowledge in the knowledge graph. The correlations between pieces of knowledge discovered by path mining, and the chains they form, can therefore support recommendation from one key article to the next, and a comprehensive and appropriate importance evaluation index can make the recommendations more credible and lay the foundation for the subsequent application of the knowledge graph. |
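
The graph-mining steps described in the abstract (PageRank to rank articles, then Prim's algorithm for a maximum spanning tree weighted by the knowledge inheritance degree) can be sketched in plain Python. This is a minimal illustration, not the Neo4j GDS workflow used in the study: the article identifiers, the KID values, the edge direction (a later article pointing to the earlier article it inherits from) and the function names are all assumptions made for the example, and the start node is simply the top-ranked article, without the in-degree/out-degree filter the study applies.

```python
import heapq
from collections import defaultdict

# Hypothetical inheritance edges: (later article, earlier article, KID weight).
EDGES = [
    ("B", "A", 0.9),
    ("C", "A", 0.4),
    ("C", "B", 0.7),
    ("D", "B", 0.2),
    ("D", "C", 0.8),
]


def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration; scores sum to 1."""
    nodes = sorted({n for s, t, _ in edges for n in (s, t)})
    out_weight = defaultdict(float)
    for s, _, w in edges:
        out_weight[s] += w
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread over all nodes.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for s, t, w in edges:
            new_rank[t] += damping * rank[s] * w / out_weight[s]
        rank = new_rank
    return rank


def maximum_spanning_tree(edges, start):
    """Prim's algorithm on the undirected view of the graph, heaviest edge first."""
    adjacency = defaultdict(list)
    for s, t, w in edges:
        adjacency[s].append((w, t))
        adjacency[t].append((w, s))
    visited = {start}
    # heapq is a min-heap, so weights are negated to pop the heaviest edge first.
    heap = [(-w, start, n) for w, n in adjacency[start]]
    heapq.heapify(heap)
    tree = []
    while heap:
        neg_w, u, v = heapq.heappop(heap)
        if v in visited:
            continue
        visited.add(v)
        tree.append((u, v, -neg_w))
        for w, n in adjacency[v]:
            if n not in visited:
                heapq.heappush(heap, (-w, v, n))
    return tree


scores = pagerank(EDGES)
top_article = max(scores, key=scores.get)
path_edges = maximum_spanning_tree(EDGES, top_article)
print(top_article, path_edges)
```

The tree returned here covers only the articles reachable from the start node; for a disconnected graph the procedure would be repeated from an unvisited article to obtain a maximum spanning forest.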