| Chronic diseases have become one of the most serious problems in the medical field because of their insidious onset, long course, and many complications. Popularizing knowledge of chronic diseases helps prevent and control them and reduces the social medical burden. Knowledge graphs are widely used because they can effectively represent the relationships among things in the objective world. A question answering system based on a knowledge graph provides concise and accurate answers on the basis of understanding the user's intention. This thesis studies knowledge graph construction and question parsing technology in depth, aiming to build a chronic disease knowledge graph and to realize intelligent question answering on top of it. The main research contents of this thesis are as follows. (1) Top-down construction of a chronic disease knowledge graph. A top-down approach suited to industry knowledge graph construction is used to build the knowledge graph of chronic diseases. After the web data are analyzed, the concept layer of the knowledge graph is defined. A web crawler based on spliced URLs is designed to obtain the initial data. The formatted data are obtained through data cleaning and stored in the Neo4j graph database, and a knowledge graph containing 2411 kinds of chronic diseases is successfully constructed. (2) Research on a BiLSTM-CRF named entity recognition model based on character-word association. To address the insufficient semantic information carried by character vectors and the fact that word vectors ignore the semantic information of the characters within a word, a vector representation optimization method based on character-word association is proposed, and context feature learning and label constraints are realized with the BiLSTM and CRF models. Finally, a BiLSTM-CRF medical entity recognition model for questions, based on character-word association, is proposed. In the process 
of model training, a rule-based automatic corpus labeling generator was designed to address the lack of labeled medical question corpora. The experimental results show that the recognition model in this thesis performs much better than common recognition models, with an accuracy of 90.08% and a recall of 88.95%. (3) Parameter design of a question classification model based on text CNN. Template matching converts the parsed questions into Cypher queries and wraps the answers in natural language according to the question classification results. Precise question classification is the key to successful question answering. To solve the multi-class question classification problem, this thesis takes the text CNN classification model as the baseline and analyzes and optimizes its main parameter, filter_size, to realize question classification. The experimental results show that when filter_size is (3,4,5), the model performs well: the overall classification accuracy on the test data under this setting is greater than 80%, and on some of the test data it exceeds 90%. (4) Implementation of an intelligent question answering prototype system based on Flask. To verify the effectiveness of the question parsing technology in this thesis, an intelligent question answering prototype system was built with the Flask framework, and 300 questions about chronic diseases were used to test the core question answering function of the system. The test accuracy reached 96%, which demonstrates the effectiveness of the overall question parsing technology and the feasibility of the system design. |
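
The template matching step described in (3) can be illustrated with a minimal sketch. It assumes that a question has already been assigned a class and that a disease entity has been recognized. The class names, node labels, relationship types, and answer wording below are hypothetical and are not taken from the thesis; only the general idea (question class selects a Cypher template, query results are wrapped in a natural-language sentence) follows the abstract.

```python
# Hypothetical question classes mapped to parameterized Cypher templates.
# Labels (Disease, Symptom, Drug) and relationship types are illustrative
# assumptions, not the schema used in the thesis.
CYPHER_TEMPLATES = {
    "symptom": (
        "MATCH (d:Disease {name: $name})-[:HAS_SYMPTOM]->(s:Symptom) "
        "RETURN s.name AS answer"
    ),
    "drug": (
        "MATCH (d:Disease {name: $name})-[:TREATED_BY]->(m:Drug) "
        "RETURN m.name AS answer"
    ),
}

# Hypothetical natural-language wrappers, one per question class.
ANSWER_TEMPLATES = {
    "symptom": "Common symptoms of {name} include: {items}.",
    "drug": "Drugs commonly used for {name} include: {items}.",
}


def build_query(question_class, disease):
    """Return (cypher, params) for a classified question and a recognized entity."""
    if question_class not in CYPHER_TEMPLATES:
        raise ValueError(f"unsupported question class: {question_class}")
    # The entity is passed as a query parameter rather than spliced into the
    # query string, so user input never becomes part of the Cypher text.
    return CYPHER_TEMPLATES[question_class], {"name": disease}


def wrap_answer(question_class, disease, rows):
    """Wrap the values returned by the graph query in a natural-language sentence."""
    if not rows:
        return f"No answer was found for {disease}."
    return ANSWER_TEMPLATES[question_class].format(
        name=disease, items=", ".join(rows)
    )
```

In a running system, the pair returned by `build_query` would be handed to a graph database driver, and the returned values passed to `wrap_answer`; that connection code is omitted here because the abstract does not describe it.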