| Sequential diagnosis and treatment, the main approach to the clinical management of chronic diseases, is a complex decision optimization process that spans multiple clinical stages. Given the complexity and individualization of traditional Chinese medicine (TCM) prescriptions, TCM diagnosis and treatment of chronic diseases is a typical complex sequential process: it involves iteratively collecting the four types of diagnostic information, determining the disease state or diagnosis, and deciding on a prescription. The discovery and intelligent application of excellent TCM sequential diagnosis and treatment schemes is therefore the core issue of clinical artificial intelligence in TCM. However, because of the complexity of the problem and the need for complete closed-loop data, previous research on TCM clinical data mining has seldom addressed it. With the accumulation of high-quality TCM clinical data and the rapid development of Reinforcement Learning methods (especially Deep Reinforcement Learning) in recent years, it has become possible to optimize TCM sequential diagnosis and treatment schemes with Reinforcement Learning. This thesis proposes AlphaPrescriber, a model for optimizing sequential diagnosis and treatment schemes in TCM. It recommends prescriptions to patients based on their initially observed symptoms and predicts the symptoms observed at the next stage from the symptoms observed before medication and the medication currently given. In this way it dynamically forms an optimized TCM sequential diagnosis and treatment scheme and provides a basis for applying artificial intelligence to individualized TCM prescription recommendation. The main research work includes the following aspects. First, to address the lack of a natural Reinforcement
Learning environment in the TCM diagnosis and treatment process, a HU HE Deep TCM Treatment Artificial Environment (TAE) model is proposed. It constructs the Reinforcement Learning "environment" from existing medical data for a given disease and predicts a patient's symptoms at the next stage from the symptoms observed before medication and the medication currently given. On the coronary heart disease data, TAE achieves an accuracy of 98.4%, a precision of 97.0%, a recall of 96.5%, and an F1 score of 96.8%. On the diabetes data, it achieves an accuracy of 87.5%, a precision of 78.49%, a recall of 78.5%, and an F1 score of 73.3%. Second, building on TAE, we developed AlphaPrescriber, which applies a Deep Reinforcement Learning algorithm to the optimization of TCM sequential diagnosis and treatment schemes and recommends prescriptions based on observed patient symptoms. On the coronary heart disease test set, the model achieves an average discounted reward of 11.38, compared with 9.15 for the traditional Q-learning algorithm and an average reward of 7.16 for prescriptions given by clinicians. On the diabetes mellitus test set, the model achieves an average discounted reward of 11.50, compared with 7.06 for the traditional Q-learning algorithm and an average reward of 4.42 for prescriptions given by clinicians. These results show that TAE scores well on its evaluation metrics, and that the sequential diagnosis and treatment model based on Deep Reinforcement Learning outperforms both traditional Reinforcement Learning and clinicians' prescriptions. |
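
The comparison above rests on an average discounted reward over test-set treatment sequences. The abstract does not state the discount factor or the reward definition, so the following is only a minimal sketch of how such a metric is commonly computed. The function names, the default `gamma = 0.9`, and the per-stage reward lists are all illustrative assumptions, not the thesis's actual implementation.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.9) -> float:
    """Return sum_t gamma**t * rewards[t] for one treatment sequence.

    `rewards` holds one (hypothetical) reward per diagnosis-and-treatment
    stage; `gamma` is an assumed discount factor.
    """
    total = 0.0
    # Accumulate from the last stage backwards: G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def average_discounted_return(
    episodes: Sequence[Sequence[float]], gamma: float = 0.9
) -> float:
    """Average the discounted return over a set of treatment sequences."""
    if not episodes:
        raise ValueError("at least one episode is required")
    return sum(discounted_return(ep, gamma) for ep in episodes) / len(episodes)
```

Under these assumptions, scoring a learned policy, a Q-learning baseline, and clinicians' recorded prescriptions with the same function and the same `gamma` is what would make the three averages directly comparable.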