| Objective: Under the guidance of the theoretical framework of the "knowledge representation method of ancient Chinese medical books based on knowledge elements", this study carries out a systematic, networked knowledge organization of the leucorrhea literature in ancient Chinese gynecological books. Using the knowledge element data on leucorrhea, feature expressions and feature words are extracted and the rules are formally defined, and a rule base of leucorrhea knowledge elements in ancient Chinese gynecological books is constructed, exploring a new path for the automatic extraction of knowledge elements from ancient Chinese medical books. Methods: (1) Philological research methods. Bibliography, edition studies, and textual collation were used to screen and collect the leucorrhea literature of ancient Chinese gynecological books. Ancient Chinese gynecological books and documents related to leucorrhea were systematically searched and collected through bibliographic methods; the target books were screened and confirmed against the inclusion criteria; the base edition of each selected book was determined through edition studies; and the electronic texts were uniformly proofread through collation. (2) Knowledge-element-based digitization of ancient books. The selected books were uploaded to the "TCM ancient book knowledge processing platform"; a dedicated indexing template for ancient Chinese gynecological books was established; the metadata and semantic types to be used were determined; and the template was used to analyze and index the selected content, yielding the corresponding knowledge bodies, knowledge elements, semantic types, and semantic association data. Based on the indexed metadata, the feature expressions and feature words of the leucorrhea knowledge elements of the ancient Chinese gynecological 
books were extracted according to the attributes of the knowledge elements, providing the data source for constructing the leucorrhea knowledge element rule base of ancient Chinese gynecological books. (3) Semantic analysis. With reference to the metadata concepts and semantic components of ancient Chinese medical books determined by the supervisor's research team, the semantic types and semantic associations contained in the feature expressions and feature words of knowledge elements with different attributes were sorted out through semantic analysis. For feature expressions and knowledge elements with special semantic associations and semantic types, the causes were analyzed from the perspectives of traditional Chinese medicine philology and knowledge element theory, the impact of this phenomenon on the construction of the knowledge element rule base was explored, and solutions were proposed for the areas that can be improved. (4) Computer information processing methods. With reference to the formal representations commonly supported by current mainstream programming languages, and taking into account the characteristics of the feature expressions and feature words of knowledge elements in ancient Chinese medical books, a formal language expressed in specific symbols was designed, so that, once the rules are formally defined, the computer can use the feature expressions and feature words to correctly identify and delimit knowledge element content with different attributes. (5) Statistical methods. The results of the automatic extraction of leucorrhea knowledge elements from ancient Chinese gynecological books were measured statistically. The rule base of leucorrhea knowledge elements was imported into the "automatic extraction platform of knowledge elements in ancient Chinese medical books", and statistics were compiled on the accuracy rate, recall rate, F1 value, bibliographic 
extraction integrity, and system extraction integrity of the extracted knowledge element data. Results: (1) Collation of the leucorrhea literature of ancient Chinese gynecological books based on knowledge elements: after screening by the inclusion and exclusion criteria, a total of 54 ancient Chinese gynecological books were selected. Based on the content of the literature, a template for knowledge indexing of leucorrhea in ancient Chinese gynecological books was developed. The template contains 2 types of knowledge bodies, 38 metadata concepts, 46 semantic types, and 23 semantic associations. After the knowledge elements of the content related to the disease were indexed, a total of 804 knowledge bodies were obtained, comprising 286 disease-syndrome knowledge bodies and 518 prescription knowledge bodies. There are 4,587 knowledge elements, of which 2,030 knowledge elements of 23 types are indexed under the disease-syndrome knowledge bodies and 2,557 knowledge elements of 10 types under the prescription knowledge bodies. A total of 15,898 instances covering 36 semantic types and 3,694 instances covering 15 semantic associations were obtained. (2) Construction of the knowledge element rule base of leucorrhea in ancient Chinese gynecological books: the key to building the rule base is to extract the feature expressions and feature words of the knowledge elements and define them formally. From 4,000 knowledge element data items filtered by the inclusion and exclusion criteria, a total of 660 feature expressions and 535 feature words were extracted and summarized. Of these, 399 feature expressions and 293 feature words belong to knowledge elements under the disease-syndrome knowledge body, and 261 feature expressions and 242 feature words to those under the prescription knowledge body. The formal definition of rules consists of four parts: 16 formal definitions of specific symbol 
rules, 5 formal definitions of constituent knowledge element rules, 5 pairs of formal definitions of exclusion rules, and 5 formal definitions of optimization rules. Semantic analysis showed that the feature expressions under 9 attributes, involving a total of 29 knowledge elements, have no matching semantic association; the feature expressions under 7 attributes, involving a total of 15 knowledge elements, have no matching semantic type; and the feature expressions under 1 attribute, involving 5 knowledge elements, match multiple semantic associations. (3) Experiment on the automatic extraction of leucorrhea knowledge elements from ancient Chinese gynecological books: the completed rule base was imported into the "Software platform for automatic extraction of knowledge elements in ancient Chinese gynaecological books", and knowledge elements were automatically extracted from the leucorrhea literature of 11 ancient Chinese gynecological books, yielding a total of 552 knowledge elements. After manual review and statistics, the number of correct knowledge elements was 443, the overall extraction accuracy rate was 80.3%, the recall rate was 74.8%, the F1 value was 77.5%, the bibliographic extraction integrity was 93.2%, and the system extraction integrity was 81.8%. The wrongly extracted and unrecognized knowledge elements were sorted out, and three causes were identified: a wrong extraction range, wrong recognition of knowledge element attributes, and the lack of a matching rule. Suggestions and measures were put forward for each of the three situations, and the rule base was further improved. After the second extraction, a total of 564 knowledge elements were obtained. After manual review and platform statistics, the number of correct knowledge elements was 470, the overall extraction accuracy rate rose to 83.3%, the recall rate rose 
to 79.4%, the F1 value rose to 81.3%, the bibliographic extraction integrity rose to 95.3%, and the system extraction integrity rose to 90.9%. Conclusion: Under the guidance of the theory of the "knowledge representation method of ancient Chinese medical books based on knowledge elements", analyzing and indexing the leucorrhea literature in ancient Chinese gynecological books makes it possible to sort out and integrate fragmented knowledge more completely and comprehensively. Before the knowledge elements of ancient Chinese medical books are analyzed and indexed, the knowledge nodes can be customized according to the content or stylistic characteristics of the books; this compensates for the shortcomings of a fixed indexing template, helps reflect the knowledge characteristics of different ancient books, improves indexing accuracy, and provides a starting point for exploring the relationships between knowledge elements. Knowledge body indexing enables researchers to further clarify the meaning of disease names and how their practical significance changed across dynasties as the names themselves changed; as data continue to accumulate, the study of disease history will become more convenient. Knowledge element indexing can systematically sort out and display the knowledge scattered through ancient books according to the attributes of the knowledge elements. This helps clinicians quickly retrieve the knowledge they need and establishes a linked knowledge system in which the relevance and inheritance of knowledge across the literature are easier to find, making the mining of tacit knowledge possible and providing new ideas and methods for knowledge mining of ancient Chinese medical books. The analysis and indexing of knowledge elements in ancient Chinese medical books shows that knowledge elements with different attributes have different feature expressions and 
feature words. Using feature expressions and feature words to match knowledge element attributes makes it possible to identify specific types of knowledge elements in the original text of ancient Chinese medical books more accurately. Combined with the formulation and formal definition of rules, this helps the computer determine and delimit the content of knowledge elements with different attributes, and thus achieves the automatic extraction of knowledge elements from ancient Chinese medical books. Feature expressions are extracted on the basis of knowledge element indexing and according to the writing characteristics of the original text. They not only carry the grammatical characteristics of classical Chinese but also retain the original form of knowledge organization from the perspective of traditional Chinese medicine, and they can reflect the topological nature of knowledge elements, thereby reducing computer misjudgment. Feature words are extracted from the text of knowledge elements of the various attributes without unification or standardization; they follow the original wording of the ancient books and fully reflect lexical characteristics in different contexts. Compared with the existing semantic network framework of traditional Chinese medicine, the semantic types here are richer and the semantic associations more complex, so they reflect the original form of the knowledge in ancient Chinese medical books more fully; they can also complement existing vocabulary reference works of traditional Chinese medicine, make word selection more typical, and make the research content more credible and persuasive. Feature expressions and feature words complement each other and reflect how the content of ancient Chinese medical books is expressed under different knowledge classifications. On the basis of the extracted feature expressions and feature words, a formal language must also be designed to 
describe these rules through specific symbols and grammar rules, forming the final knowledge element rule base of ancient Chinese gynecological books. The formal language should fit the characteristics of the knowledge element attributes and the semantic network framework of ancient Chinese medical books as closely as possible, so that computer programs can readily understand and process knowledge element content. Building on regular expressions, this research designs a formal definition of rules suited to the content characteristics of ancient Chinese medical books. It supports Boolean expressions, references to dictionaries and feature thesauri, nesting of feature expressions, exclusion rules, and the use of metacharacters to express special text formats. These features improve the computer's ability to automatically recognize knowledge elements with different attributes. The results of the automatic extraction experiment show that building a knowledge element rule base can help computers automatically recognize and extract knowledge elements from ancient Chinese medical books. By sorting out the knowledge elements that were identified incorrectly or not extracted, analyzing the causes, taking corrective measures, and continually refining the formal definition of the rules, the adaptability of the rule base can be expanded and the accuracy and completeness of automatic extraction improved. Innovation points: (1) This study combines traditional philological research methods with the research method of 
digitizing ancient books based on knowledge element theory and, for the first time, conducts knowledge element analysis, indexing, and collation of the leucorrhea literature in ancient Chinese gynecological books. Before knowledge element parsing and indexing, the template indexing method was improved and a dedicated template for knowledge element indexing of leucorrhea was designed. The core and non-core metadata in the indexing template were defined, and the interoperability and linkage between the metadata were clarified. A systematic, networked knowledge organization was carried out for the leucorrhea literature in ancient Chinese gynecological books, which also provides research ideas for collating knowledge elements in ancient Chinese gynecological books and other types of ancient Chinese medical books. (2) The components of the knowledge element rule base of ancient Chinese medical books have been basically determined. Based on the results of knowledge element analysis and indexing, the feature expressions and feature words of knowledge elements were extracted and examined through semantic analysis. Specific symbols, constituent knowledge elements, exclusion rules, and optimization rules were formally defined to form the knowledge element rule base of leucorrhea in ancient Chinese gynecological books. (3) Based on the original semantic association template and the characteristics of the indexing data, semantic types were extracted for "crowd" knowledge elements, and semantic associations within such knowledge elements were constructed; semantic associations across knowledge elements were constructed for the four knowledge elements "interpretation", "synonym", "origin", and "citation", complementing and improving the semantic network of traditional Chinese medicine knowledge elements. (4) For the first time, a demonstration study on the automatic extraction of knowledge elements was conducted through the construction of 
a knowledge element rule base for leucorrhea in ancient Chinese gynecological books. Based on the automatic extraction results, the feature expressions and feature words were supplemented and refined, further improving the rule base. The experimental results show that using the knowledge element rule base to automatically extract knowledge elements from ancient Chinese medical books is feasible, providing a paradigm for the automatic extraction of knowledge elements from other types of ancient Chinese medical books. |
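
As a rough arithmetic check on the extraction figures reported in the Results, the following minimal sketch recomputes the accuracy rate and F1 value for the two extraction rounds. It rests on two assumptions that the abstract does not state explicitly: that the "accuracy rate" is the number of correct knowledge elements divided by the number extracted (i.e. precision), and that the F1 value is the harmonic mean of that rate and the recall rate. The recall rates are taken as reported, because the number of reference knowledge elements is not given. The function and variable names are illustrative only.

```python
def accuracy_rate(correct: int, extracted: int) -> float:
    """Share of extracted knowledge elements judged correct (assumed definition)."""
    return correct / extracted


def f1_value(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of the F1 value)."""
    return 2 * precision * recall / (precision + recall)


# Counts and recall rates as reported in the abstract for the two rounds.
ROUNDS = {
    "first extraction": {"extracted": 552, "correct": 443, "recall": 0.748},
    "second extraction": {"extracted": 564, "correct": 470, "recall": 0.794},
}


def summarize(rounds: dict) -> dict:
    """Return the recomputed accuracy rate and F1 value for each round."""
    summary = {}
    for name, r in rounds.items():
        p = accuracy_rate(r["correct"], r["extracted"])
        summary[name] = {"accuracy": p, "f1": f1_value(p, r["recall"])}
    return summary


if __name__ == "__main__":
    for name, values in summarize(ROUNDS).items():
        print(f"{name}: accuracy={values['accuracy']:.4f}, F1={values['f1']:.4f}")
```

Under these assumptions the recomputed values agree with the reported 80.3%/77.5% and 83.3%/81.3% to within rounding of the inputs, which suggests the reported accuracy rate is a precision-style measure.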