| In an era of rapidly growing digital information, people increasingly rely on the Internet for information. In health care, people often look up diseases and symptoms online to understand their health status and to prevent or treat conditions in time. However, retrieving information through traditional search engines is inconvenient: users must design and refine keywords and then screen the results further before they find an answer, which makes it hard to obtain medical and health information quickly. This thesis researches and develops a question answering system for the health care field to assist medical services and help users quickly understand their health, so that they know when they should go to the hospital. The main work of this thesis is as follows. Firstly, because medical questions are phrased in many different ways, it is difficult to extract disease and symptom entities from a question using rule templates. A semantic similarity calculation method is therefore proposed that combines an edit distance-based method, a character overlap coefficient-based method and a word vector-based method. The experimental results show that the hybrid method outperforms each of the three methods used alone, which indicates that it can effectively extract disease and symptom entities that are semantically similar to dictionary entries but do not appear in the dictionary. Secondly, to determine the query objective for database queries, a multi-class intent recognition model is designed and trained with the Naive Bayes algorithm. In the test experiment, the best F1 score of the model reached 0.9686, indicating that the classifier can identify the query intent type of most user input. Finally, medical questions usually have multiple answers rather than a single standard answer, so it is difficult for the question answering system to choose the best answer. A question-answer matching model based on the attention mechanism and character embedding is proposed. The model combines character embedding, an attention mechanism and a multi-scale convolutional neural network. Experiments on the question and answer database show that the model outperforms BiLSTM (bidirectional long short-term memory) and a single-scale convolutional neural network on the Chinese medical question-answer matching task, which also indicates that the model can meet the requirements of this system. Based on these three methods and models, this thesis completes the construction and performance testing of the medical question answering system. The performance tests show that the recall rate and accuracy of the system reach 85.0% and 76.0% respectively, indicating that the system achieves a good question answering effect. |
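
The hybrid similarity calculation named in the abstract (edit distance, character overlap coefficient and word vectors) can be illustrated with a minimal sketch. The abstract does not say how the three scores are combined, so the equal-weight average below is an assumption, as are the function names, the `weights` parameter and the `vectors` lookup table (a hypothetical mapping from a term to its word vector). It is not the thesis implementation.

```python
import math


def edit_similarity(a: str, b: str) -> float:
    """Return 1 minus the Levenshtein distance normalized by the longer length."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            # deletion, insertion, substitution
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def overlap_coefficient(a: str, b: str) -> float:
    """Shared characters divided by the size of the smaller character set."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / min(len(sa), len(sb))


def cosine(u, v) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def hybrid_similarity(a, b, vectors, weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted average of the three scores (the weighting scheme is assumed)."""
    w_edit, w_overlap, w_vec = weights
    # Terms without a known vector contribute 0 to the word-vector score.
    vec_sim = cosine(vectors[a], vectors[b]) if a in vectors and b in vectors else 0.0
    return (w_edit * edit_similarity(a, b)
            + w_overlap * overlap_coefficient(a, b)
            + w_vec * vec_sim)


def best_match(mention, dictionary, vectors):
    """Pick the dictionary entity most similar to a mention from the question."""
    return max(dictionary, key=lambda entity: hybrid_similarity(mention, entity, vectors))
```

In a sketch like this, a mention that is missing from the dictionary is mapped to its closest dictionary entity. A real system would also need a similarity threshold below which no entity is returned, and it would use trained word vectors in place of the hypothetical `vectors` table.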