| Traditional Chinese medicine (TCM) data are the main knowledge resource of TCM and contain rich clinical experience. Because these data carry TCM knowledge and information in textual form, studying them with machine learning methods can save considerable labor cost, improve the objectivity of TCM, and better promote TCM-related knowledge. At present, TCM doctors assess a patient's condition mainly through the four diagnostic methods (inspection, listening and smelling, inquiry, and pulse-taking), then determine the syndrome type through syndrome differentiation, and finally write the prescription. As a result, disease diagnosis and prescription recommendation depend heavily on the doctor's diagnostic experience. To reduce the errors that may occur in diagnosis, data mining can be used to filter out noise and unreliable records in TCM prescription data and extract the essential knowledge, which can greatly promote the modernization of TCM. The main work of this paper is as follows: (1) To address missing values, duplicate names, and nonstandard terms in TCM prescription data and drug data, data preprocessing is carried out, and the main-indication information of the prescription data is standardized by building a TCM synonym set, which provides the necessary basis for research on TCM prescription recommendation methods. (2) To address synonymy in TCM symptom descriptions, where several different expressions share one meaning, this paper proposes a symptom standardization method based on BERT. By introducing a BERT model that integrates TCM knowledge, a good semantic representation of symptoms can be obtained. Experimental results show the superiority of the proposed method. (3) To address the labor and time cost of supervised syndrome labeling, this paper proposes a prescription recommendation algorithm based on mutual information clustering, which measures the relationship between symptoms through mutual information and then combines the symptom clustering results with a search algorithm to recommend and screen prescriptions. In the experiment, different symptom combinations are input to the model and the recommended prescriptions are analyzed; the results show the feasibility of the proposed method. |
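
The abstract does not state how the mutual information between symptoms is estimated. As a minimal sketch of one common choice, the Python below treats each symptom as a binary present/absent variable per prescription record and estimates the mutual information of each co-occurring pair from counts. The function name `pairwise_mutual_information` and the sample records are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def pairwise_mutual_information(records):
    """Estimate mutual information (in bits) for every co-occurring symptom pair.

    Assumption: each record is a list of symptom strings, and each symptom is
    modeled as a binary variable (present or absent in a record).
    """
    n = len(records)
    sets = [set(r) for r in records]
    single = Counter(s for r in sets for s in r)
    pair = Counter(p for r in sets for p in combinations(sorted(r), 2))

    scores = {}
    for (a, b), n11 in pair.items():
        n_a, n_b = single[a], single[b]
        # Joint counts for the four presence/absence combinations,
        # each paired with its two marginal counts.
        cells = (
            (n11, n_a, n_b),                          # a present, b present
            (n_a - n11, n_a, n - n_b),                # a present, b absent
            (n_b - n11, n - n_a, n_b),                # a absent,  b present
            (n - n_a - n_b + n11, n - n_a, n - n_b),  # a absent,  b absent
        )
        mi = 0.0
        for n_xy, n_x, n_y in cells:
            if n_xy > 0:  # empty cells contribute nothing to the sum
                mi += (n_xy / n) * math.log2(n_xy * n / (n_x * n_y))
        scores[(a, b)] = mi
    return scores


# Hypothetical toy records, for illustration only.
records = [["fever", "cough"], ["fever", "cough"], ["insomnia"], ["insomnia"]]
scores = pairwise_mutual_information(records)
```

A pairwise score table of this kind could then feed a clustering step that groups strongly associated symptoms. How the paper performs that clustering and the subsequent search is not specified in the abstract, so it is left out of this sketch.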