| Traditional Chinese medicine (TCM) is the accumulated experience of ancient Chinese people in exploring and overcoming disease, and the crystallization of the wisdom of ancient healers and sages. Studying and inheriting TCM culture and exploring its potential value therefore have great research significance. However, because the TCM knowledge system is vast and loosely structured, it is difficult for modern people to learn and use TCM knowledge; moreover, the complex TCM treatment process is poorly suited to the fast pace of modern life. Search engines, the main way to obtain knowledge in the Internet era, rely on keyword search and suffer from problems such as irrelevant answers and redundant results, so they struggle to meet users’ needs for accurate knowledge acquisition. These problems have become shackles on the modernization and development of TCM. How to manage TCM knowledge efficiently, lower the difficulty of learning it, and improve the efficiency of accessing it is an urgent problem for the modernization of TCM. To address these problems, this paper combines knowledge graph and pre-trained model technology with question answering (Q&A), and designs and builds a Q&A system based on a TCM knowledge graph to provide users with convenient and accurate TCM knowledge services. The specific research work is as follows. (1) To address the problem that TCM knowledge is loosely distributed and difficult to learn and use, this paper refers to several existing TCM knowledge graph structures and designs a TCM diagnosis and treatment knowledge graph with diseases, syndromes, symptoms, drugs, and prescriptions as the main entities, containing 39,492 entities and 253,654 entity relationships. The knowledge graph is stored and visualized using the Neo4j database, which provides the data basis for the design of the Q&A system. (2) To 
address the problem that user questions in TCM are complex and variable and therefore difficult for machines to understand, this paper implements named entity recognition and entity relationship extraction in a pipeline approach, based on the BERT and ALBERT pre-trained models. For the named entity recognition task, this paper uses the BIO annotation scheme to annotate five categories of entities, namely disease, syndrome, symptom, medicine, and prescription, collates 34,973 annotated entities, and conducts comparison experiments with BiLSTM-CRF as the baseline model. The experimental results show that the named entity recognition methods based on the ALBERT-CRF and BERT-CRF models achieve F1 values of 84.12% and 88.63%, respectively, and that the BERT-CRF model improves by about 4.5% over the baseline method. For the entity relationship extraction task, this paper annotates 7,784 training utterances in 7 categories, based on the entity relationships and entity attribute relationships in the knowledge graph structure and on the TCM medical question intent dataset. The experimental results show that the entity relationship extraction methods based on the ALBERT and BERT models achieve F1 values of 70.47% and 77.28%, respectively. (3) Based on the above two research results, this paper designs and implements a Q&A system based on the TCM knowledge graph using a Flask-MySQL architecture. The system contains modules such as knowledge Q&A, FAQ retrieval, and Q&A detail display. The knowledge Q&A module can provide timely answers to users’ questions about diseases and symptoms in TCM, and recommend treatment plans with common drugs and prescriptions. The FAQ retrieval module lets users view adopted Q&A records and supports keyword-based fuzzy search. The Q&A detail display module supplements and extends the knowledge Q&A and FAQ modules: it displays the detailed process of parsing the user’s question and the complete answer given by the system, and visualizes the 
knowledge graph data based on the ECharts plugin. |
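
The BIO annotation scheme mentioned in (2) can be illustrated with a minimal sketch. It is not the paper's implementation: the label names (`DIS`, `SYN`, `SYM`, `MED`, `PRE`), the function names, and the example sentence are assumptions chosen only to show how entity spans map to per-token tags and back.

```python
from typing import List, Tuple

# Hypothetical short labels for the five entity categories named in the
# abstract: disease, syndrome, symptom, medicine, prescription.
LABELS = {"DIS", "SYN", "SYM", "MED", "PRE"}

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode entity spans as one BIO tag per token."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        if not 0 <= start < end <= n_tokens:
            raise ValueError(f"span out of range: {(start, end)}")
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode BIO tags back into entity spans.

    An I- tag that does not continue an open entity of the same label
    is ignored, i.e. treated like O.
    """
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes the last span
        if label is not None and tag == f"I-{label}":
            continue                        # still inside the open entity
        if label is not None:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


# Illustrative (made-up) question fragment, tokenized by word.
tokens = ["common", "cold", "may", "cause", "headache"]
gold = [(0, 2, "DIS"), (4, 5, "SYM")]
tags = spans_to_bio(len(tokens), gold)
```

In a pipeline like the one described, a sequence tagger such as BERT-CRF would predict the tag list, and a decoder like `bio_to_spans` would recover the entities passed on to relationship extraction.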