Data-driven Precise Diagnosis And Treatment Model Of Traditional Chinese Medicine CHINA ACADEMY OF CHINESE MEDICAL SCIENCES

Posted on: 2020-09-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Q Zhao
GTID: 1364330578970372
Subject: Chinese medical science
Abstract/Summary:
Data-driven refers to a decision-support approach in which data is the dominant factor. In a broad sense, all behaviors are data-driven; in a narrow sense, the opposite of data-driven is experience-driven. Through three processes (data acquisition, data modeling and data analysis), chaotic data can be transformed into results that support decision-making. The idea of individualized medical treatment embodied in "precision medicine" is consistent with the traditional Chinese medicine (TCM) principle of tailoring treatment to the person, place and time. The "precise" diagnosis and treatment proposed in this study therefore refers to TCM's interpretation of precision in terms of the relationships among data. By improving and introducing artificial intelligence methods such as machine learning, and by applying feature extraction and visual analysis to classical TCM documents, medical cases and related data, which contain rich TCM knowledge resources, it is possible to classify the different states and stages of "disease-syndrome" accurately, to find the precise correspondence of "symptom-disease-syndrome-treatment", to further optimize the medical process of syndrome differentiation and treatment, and to improve the clinical efficacy of TCM. This work is a cross-disciplinary result of research on the objectification, standardization and informatization of TCM diagnosis and treatment.

Objective: To explore methods for fusing multi-source heterogeneous TCM data; to study how TCM text data can be transformed from empirical data into objective data; to design a calculation method based on text feature processing; to process the four-diagnosis data of TCM automatically; and to carry out syndrome differentiation and treatment on four-diagnosis text data in a quantitative way. Taking the text feature data of TCM as the main research object, this study searches for correlations between TCM diseases and TCM syndrome types, discovers the regularities and relationships among TCM symptoms, diseases, syndrome types, treatment methods and prescriptions, and constructs a precise TCM diagnosis and treatment model guided by data-driven thinking.

Method:
(1) The concepts, definitions, symptoms, treatments and prescriptions of TCM diseases and syndromes were collected from the textbook TCM Internal Medicine, and the relevant TCM terms, symptoms and signs were collected from the Dictionary of Traditional Chinese Medicine. After preprocessing such as deduplication and normalization, the data were combined with the corresponding results of the basic work of the Ministry of Science and Technology to form a multi-source heterogeneous TCM data set.
(2) The jieba word segmentation tool was used to segment the TCM text data, transforming it from unstructured text into text vectors.
(3) The TF-IDF algorithm and the TextRank algorithm were each used to extract keyword features from the text vectors and to calculate feature weights. Precision, Recall and F1 were used to evaluate and select among the extraction results, thereby realizing the fusion of multi-source heterogeneous data.
(4) A correlation algorithm for TCM diagnosis and treatment features was proposed using mathematical analysis and implemented in C# with Visual Studio 2015, with the correlation coefficient expressing accuracy. The algorithm combines the symptom features of each disease, and of each syndrome corresponding to that disease, with their feature weights. Through feature selection and weighted calculation it quantifies the process of disease differentiation and syndrome differentiation, thereby making syndrome differentiation and treatment objective, producing precise differentiation results and realizing precise TCM diagnosis and treatment.
(5) Following the TCM approach of combined disease-syndrome diagnosis and treatment, the concepts and symptoms of diseases and syndromes in TCM Internal Medicine were extracted. The approach in (3) was used to extract features and calculate weights, giving the disease-feature correlation and the disease-syndrome-feature correlation. These were combined with the disease-syndrome-treatment-prescription relationship to construct the TCM disease differentiation model and the TCM syndrome differentiation and treatment model. A data-driven precise TCM diagnosis and treatment model was then constructed by adding the correlation calculation method for TCM diagnosis and treatment features.
(6) The model content was visualized and its network topology analyzed using Cytoscape software.

Result:
(1) In the data collection stage, a total of 69 TCM disease names were obtained, corresponding to 366 syndrome types, 366 treatment methods, 366 prescriptions and 22,989 conceptual nouns, symptoms and signs, totaling 138,336 words. From these a multi-source heterogeneous TCM data set was constructed.
(2) The "disease name" and "concept" fields of the TCM diagnosis and treatment data set were used to construct the TCM disease text data set, covering the names of 69 TCM diseases and their corresponding definitions and symptoms in the textbook TCM Internal Medicine. After segmentation of this data set, the TF-IDF algorithm yielded 127-ID effective features, 241 of them with weights greater than 0.6. The TextRank algorithm yielded 862 effective features, 534 of them with significant weights. After evaluating the results of the two algorithms, the TextRank results were chosen as the data for the TCM disease model. After disease feature extraction and weight calculation, the diseases were linked through their features to explore the relationships between diseases. The TextRank results were imported into Cytoscape with the TCM disease name as the target node, the feature as the source node and the weight as the edge, and the feature network of the common cold, the feature network of lung diseases and the feature network of all diseases were established.
(3) The five fields "disease name", "syndrome type", "symptoms", "treatment method" and "prescription" of the TCM data set, covering the names of 69 TCM diseases in the textbook TCM Internal Medicine and all the concepts, symptoms and corresponding prescription and drug information of 366 syndrome types, were used to construct the TCM syndrome text data set. After segmentation of this data set, the TF-IDF algorithm yielded 6,194 effective features, 741 of them with weights greater than 0.6, and the TextRank algorithm yielded 3,490 effective features, 2,553 of them with weights greater than 0.6. Both algorithms extract feature words with high accuracy, and TextRank extracts far fewer features than TF-IDF. However, the weight is also a main influencing factor in the later calculation of the precise model. Because TextRank produces higher weights, the features and weights extracted by TextRank were used as the basic data for constructing the syndrome differentiation and treatment model. After syndrome feature extraction and weight calculation, the features of the various diseases and syndrome types were combined to explore the relationship between disease and syndrome. The TextRank results were imported into Cytoscape with the TCM syndrome type as the target node, the feature as the source node and the weight as the edge, and the syndrome network of the common cold, the syndrome network of lung diseases and the syndrome network of TCM internal medicine were established. Fusing the disease feature network with the syndrome network finally gives the "disease-syndrome-feature" association network, which can complete syndrome differentiation and treatment through feature matching and weight calculation.
(4) The data-driven precise TCM diagnosis and treatment model builds on the TCM disease model and the combined disease-syndrome differentiation model, and integrates the correlation algorithm for TCM diagnosis and treatment features. The accuracy of the model is reflected by the correlation coefficient of that algorithm: a quantitative method replaces the subjective judgment process, so that syndrome differentiation changes from calculating a probability to calculating a correlation, and the similarity (correlation coefficient) serves as the reference for evaluating accuracy. Given TCM four-diagnosis data as input, the model automatically analyzes the relevant TCM features in the data, identifies the disease and syndrome through the TCM disease model and the syndrome differentiation model, and calculates the final differentiation result with the correlation algorithm. Tests on three types of TCM medical records produced completely correct, biased and partially correct syndrome differentiation results. After testing 60 TCM medical case samples, the model based on the textbook data of TCM Internal Medicine was found to have a diagnostic accuracy of 10% for modern TCM medical cases and 60% for modern TCM medical cases. The accuracy of the model is in line with expectations, and the model can perform syndrome differentiation and treatment on multi-source heterogeneous text data.
(5) The research is innovative in the following respects:
a. A dictionary of TCM-specific vocabulary was established, the jieba tool was introduced for Chinese word segmentation of TCM text data, and the TF-IDF and TextRank algorithms were used to extract TCM keyword features from the segmentation results and to calculate feature weights, thereby vectorizing TCM text data.
b. For the first time, a correlation calculation method for TCM diagnosis and treatment features based on TCM text features and feature weights is proposed. By calculating over the number of vector features and the feature weights, the correlation between a feature set and TCM diseases and syndromes can be obtained, establishing a TCM----correlation relationship based on features and weights.
c. Based on data-driven thinking, a dynamic and open precise TCM diagnosis and treatment model is constructed. Its basic structure is the combined disease-syndrome diagnosis and treatment model, and it is composed of the disease diagnosis model, the syndrome differentiation model and the correlation calculation module. The model first diagnoses the disease from the four-diagnosis data and then differentiates the syndrome, achieves precise diagnosis and treatment by calculating the correlation between the four-diagnosis data and the diseases and syndromes, and finally outputs the TCM disease name, syndrome type, treatment method and prescription for the four-diagnosis data.

Conclusion:
(1) Chinese word segmentation is an important method and tool for studying TCM big data. Segmentation turns the descriptive language of TCM from sentences into vocabulary, which makes it easier for computers to "understand" the text data.
(2) The TF-IDF and TextRank algorithms can both extract feature keywords from TCM text data and calculate feature weights. TF-IDF extracts more features, but their average weight is lower; TextRank extracts fewer features than TF-IDF, but their average weight is higher.
(3) Constructing a feature-based TCM disease model and syndrome differentiation model shows that there are strong correlations between different diseases and syndrome types in TCM. By calculating the correlation, the various diseases and syndromes of TCM can be organically connected, which gives an objective expression of TCM diagnosis and treatment and at the same time verifies the "holistic concept" of TCM.
(4) The TCM disease model and TCM syndrome differentiation model constructed in this study can complete TCM disease diagnosis and syndrome differentiation by feature matching and weight calculation on TCM four-diagnosis data. The whole process can be traced back, and each step is presented quantitatively.
(5) The data-driven TCM diagnosis and treatment model can handle multi-source heterogeneous TCM data, and can learn TCM "principle, method, formula and medicine" knowledge under supervised conditions to realize TCM artificial intelligence. The model can understand descriptive TCM four-diagnosis data and automatically extract the text data needed for syndrome differentiation and treatment. After feature extraction and correlation calculation, it outputs the diagnosis and treatment results for the four-diagnosis data, making a certain contribution to research on the objectification, standardization and informatization of TCM diagnosis and treatment.
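
The pipeline summarized in the abstract (segmented text, weighted keyword features, Precision/Recall/F1 evaluation, and a weighted correlation between observed features and each syndrome) can be illustrated with a minimal Python sketch. It is not the dissertation's C# implementation. The toy token lists, the smoothed TF-IDF variant, the per-document scaling of weights to (0, 1], and the correlation formula (matched feature weight divided by total profile weight) are all assumptions made for illustration, since the abstract does not give the exact formulas. Names such as `SYNDROME_TOKENS`, `tfidf_weights`, `correlation` and `rank_syndromes` are hypothetical.

```python
import math
from collections import Counter

# Toy, pre-segmented syndrome texts (hypothetical data, not from the dissertation).
# In the described pipeline these token lists would come from a word segmenter.
SYNDROME_TOKENS = {
    "wind-cold": ["chills", "chills", "fever", "no_sweat", "headache", "cough"],
    "wind-heat": ["fever", "fever", "sore_throat", "sweat", "headache", "cough"],
    "summer-damp": ["fever", "heavy_limbs", "chest_tightness", "nausea", "sweat"],
}


def tfidf_weights(docs):
    """Return {doc: {term: weight}}, scaled so the top term of each doc has weight 1.

    Assumes every document has at least one token. Uses the smoothed variant
    tf * (ln(N / df) + 1), which is an illustrative choice.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    weights = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        raw = {
            term: (count / len(tokens)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        }
        top = max(raw.values())
        weights[name] = {term: w / top for term, w in raw.items()}
    return weights


def precision_recall_f1(extracted, reference):
    """Score an extracted keyword set against a reference keyword set."""
    extracted, reference = set(extracted), set(reference)
    hits = len(extracted & reference)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def correlation(observed, profile):
    """Assumed measure: weight of matched features divided by total profile weight."""
    total = sum(profile.values())
    matched = sum(w for term, w in profile.items() if term in observed)
    return matched / total if total else 0.0


def rank_syndromes(observed, profiles):
    """Rank syndromes by correlation with the observed features, highest first."""
    scores = {name: correlation(observed, p) for name, p in profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


PROFILES = tfidf_weights(SYNDROME_TOKENS)

if __name__ == "__main__":
    for name, score in rank_syndromes({"chills", "no_sweat", "headache"}, PROFILES):
        print(f"{name}: {score:.3f}")
```

Because each score is a ratio of matched weight to total weight, it stays between 0 and 1 and can be traced back to the individual features that contributed to it, which mirrors the traceability the abstract claims for its own correlation coefficient.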
Keywords/Search Tags:Data-driven, Accurate diagnosis and treatment model, Combination of disease and syndrome, Text vectorization, Machine learning, Information technology of traditional Chinese medicine