| According to statistics, the prevalence of heart disease in China has risen continuously for many years, and heart disease has become a public health problem. To determine the type of heart disease, patients must undergo an electrocardiogram (ECG) examination during clinical diagnosis and treatment. ECG results contain a great deal of in-depth medical knowledge, but they cannot fully cover the knowledge of heart disease. At the same time, health websites provide a large amount of shallow medical knowledge about heart disease. With the popularity of artificial intelligence and the rise of knowledge graph technology, it has become possible to extract and organize medical knowledge from text data. How to make full use of existing ECG results and the medical knowledge provided by medical health websites to construct a heart disease knowledge graph that supports auxiliary diagnosis is a challenge that current research needs to solve. To address these problems, this thesis uses natural language processing technology to construct a heart disease knowledge graph from ECG results and the medical knowledge provided by medical health websites, and implements a heterogeneous graph classification model with a multi-level attention mechanism to complete the auxiliary diagnosis task. The major work of this thesis is as follows. First, a heart disease knowledge graph is constructed. To present heart disease information fully, ECG results and medical health website data are used as the medical knowledge data sources. Driven by these data, the entity and relationship types of the heart disease knowledge graph are determined, forming the schema layer of the knowledge graph. To address the problems of too many entity types and an insufficient corpus, a cascade network model is designed to improve entity recognition performance: entity names are recognized by a BERT-BiLSTM-CRF network, and entity types are then assigned by a BERT-MLP network. The experimental results show that the cascade network model achieves better precision, recall, and F1 score on the heart disease entity recognition task than the BERT-LSTM, BERT-BiLSTM, and BERT-BiLSTM-CRF models. Based on entity categories and corpus features, rules are constructed manually to extract relationships between entities. The Neo4j graph database is used for persistent storage of the heart disease knowledge graph. Second, based on the heart disease knowledge graph, a heterogeneous graph classification model is proposed to complete the heart disease diagnosis task. Relying on the constructed knowledge graph, the auxiliary diagnosis task is analyzed and defined as a heterogeneous graph classification problem, and a graph dataset is constructed. A heterogeneous graph classification model is then designed with a multi-level attention mechanism to complete the auxiliary diagnosis task for heart disease. The experimental results show that the heterogeneous graph classification model achieves better accuracy on the constructed graph dataset than models based on graph2vec, R-GCN, and GNNs. |
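
The rule-based relation extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual rules, entity types, or relation names, so everything below (`Entity`, `Rule`, `RULES`, `extract_relations`, the type labels, and the trigger keywords) is a hypothetical assumption chosen only to show the general idea: a rule fires when the two entity types match and a trigger phrase appears between the entities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    type: str   # hypothetical type label, e.g. "Disease"
    start: int  # character offset of the entity in the sentence


@dataclass(frozen=True)
class Rule:
    head_type: str
    tail_type: str
    triggers: tuple  # any of these phrases must occur between head and tail
    relation: str


# Hypothetical rules for illustration; not the rule set used in the thesis.
RULES = [
    Rule("Disease", "Symptom", ("causes", "presents with"), "HAS_SYMPTOM"),
    Rule("Disease", "ECGFeature", ("shows", "is characterized by"), "HAS_ECG_FEATURE"),
]


def extract_relations(sentence, entities, rules=RULES):
    """Return (head, relation, tail) triples for entity pairs matched by a rule."""
    triples = []
    for head in entities:
        for tail in entities:
            # Only consider pairs where the head precedes the tail.
            if head is tail or head.start >= tail.start:
                continue
            between = sentence[head.start + len(head.text):tail.start]
            for rule in rules:
                if (head.type == rule.head_type
                        and tail.type == rule.tail_type
                        and any(t in between for t in rule.triggers)):
                    triples.append((head.text, rule.relation, tail.text))
    return triples


sentence = "Atrial fibrillation causes palpitations and shows absent P waves on ECG."
entities = [
    Entity(text, etype, sentence.index(text))
    for text, etype in [
        ("Atrial fibrillation", "Disease"),
        ("palpitations", "Symptom"),
        ("absent P waves", "ECGFeature"),
    ]
]
triples = extract_relations(sentence, entities)
```

In this sketch the typed entities are assumed to come from an upstream recognizer such as the cascade model, and the resulting triples are the kind of records that could then be persisted as nodes and edges in a graph database.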