TCM Big Data Resources Data Warehouse Construction And Prescription Analysis Application Research

Posted on:2022-10-22

Degree:Master

Type:Thesis

Country:China

Candidate:J L Wu

Full Text:PDF

GTID:2504306560490514

Subject:Software engineering

Abstract/Summary:

Traditional Chinese medicine (TCM) embodies the philosophy and health wisdom that the Chinese nation has developed over thousands of years, and long-term clinical practice has accumulated rich and valuable resources. These resources are diverse, large in volume and widely distributed across the field of TCM, so how to integrate, use and manage the data effectively is an open problem for the field. The TCM prescription is an important part of the TCM system of theory, method, prescription and medicine; it is formed by selecting and combining drugs on the basis of syndrome differentiation and treatment. From large-scale clinical data, effective core prescriptions and potential drug compatibilities for treating a disease can be discovered to support clinical decision-making. At present, however, TCM data are still stored and processed with traditional methods, which scale poorly and easily run into bottlenecks. To address this, this thesis combines big data technology with machine learning, complex network and other algorithms to carry out distributed mining of massive clinical data. The main contents are as follows:

(1) A data warehouse of TCM big data resources was built on the CDH (Cloudera’s Distribution Including Apache Hadoop) big data platform. First, an architecture combining top-down and bottom-up design is proposed to make the logical structure of the data warehouse clearer. The multi-source data are collected into HDFS, the characteristics of the data and the relationships among them are analyzed, and a subject domain model and a multi-dimensional data model are designed. ETL tasks were then developed with Spark, HiveQL and other technologies, and the ETL workflow was configured in DolphinScheduler to map the multi-source data into the data warehouse, which currently contains nearly 340 million records and about 351 GB of data. Finally, Kylin was used to build data cubes on the prescription subject, and a demonstration of multi-dimensional OLAP analysis was carried out. The data warehouse supports multi-source data integration and data processing, as well as web-based multi-dimensional analysis and data mining.

(2) Based on the data warehouse of TCM big data resources, distributed mining of clinically effective TCM prescriptions was completed. First, clinical diagnosis and treatment data of COPD patients are extracted from the data warehouse to form a data mart. Patients are then divided into an effective group and an ineffective group according to treatment outcome, and propensity score matching is used to eliminate confounding bias between the two groups. From the prescription information of the effective group, a drug compatibility network is constructed, and a multi-scale backbone network algorithm is used to extract the core drug subnetwork. Using the effective-prescription drug concentration analysis method (P<0.05), 165 effective prescriptions were found, with an effective ratio of 80.88%, which could serve as core prescriptions for the treatment of COPD. Finally, knowledge about effective drugs and diseases was extracted with the conditional mutual information method.

(3) Distributed mining of the compatibility rules of TCM prescriptions was carried out. To mine association rules in TCM prescriptions efficiently, a distributed CHARM algorithm is proposed. Built on the Spark framework, the algorithm solves the low efficiency and memory overflow problems of traditional methods. To deal with the large number of association rules, a distributed compression algorithm is proposed to obtain fewer, more representative association rules. Experiments show that the resulting association rules offer useful guidance for clinical practice.
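
To make the idea behind item (3) more concrete, the following is a minimal single-machine Python sketch of closed frequent itemset mining over a vertical (item to transaction-id set) layout, which is the representation CHARM works with. It is not the distributed Spark algorithm proposed in the thesis, and it omits CHARM's search-tree pruning. The herb names, the toy prescriptions and the function names `vertical` and `closed_frequent_itemsets` are assumptions made purely for illustration.

```python
# Toy prescriptions (hypothetical data, not taken from the thesis).
PRESCRIPTIONS = [
    {"banxia", "chenpi", "fuling"},
    {"banxia", "chenpi"},
    {"banxia", "chenpi", "gancao"},
    {"fuling", "gancao"},
]


def vertical(transactions):
    """Map each item to the set of transaction ids (tidset) that contain it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def closed_frequent_itemsets(transactions, min_support):
    """Return {closed itemset: support} for itemsets in >= min_support transactions.

    Assumes min_support >= 1. Every frequent itemset is enumerated depth-first
    by intersecting tidsets; each one is then replaced by its closure, i.e. all
    items shared by every transaction that contains it.
    """
    tidsets = vertical(transactions)
    items = sorted(tidsets)
    closed = {}

    def extend(prefix_tids, start):
        for i in range(start, len(items)):
            tids = prefix_tids & tidsets[items[i]]
            if len(tids) < min_support:
                continue  # supersets cannot be frequent either
            closure = frozenset(it for it in items if tids <= tidsets[it])
            closed[closure] = len(tids)
            extend(tids, i + 1)

    extend(set(range(len(transactions))), 0)
    return closed


if __name__ == "__main__":
    for itemset, support in closed_frequent_itemsets(PRESCRIPTIONS, 2).items():
        print(sorted(itemset), support)
```

On this toy input, five itemsets are frequent at a minimum support of 2, but only three are closed, which shows in miniature why mining closed itemsets yields fewer, less redundant candidates for association rules.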

Keywords/Search Tags:

Data Warehouse, Machine Learning, Complex Network, Association Rules, Distributed CHARM

Related items

1	Research Of The Reliable Platform Of Distributed Machine Learning For Medical Data Based On Blockchain
2	Distributed Parallel Machine Learning Algorithms And The Application In Biomedical Field
3	Risk Prediction Of Hypertension Based On Machine Learning Methods And Genotype Data
4	Implicit Association Discovery And Measurement Of Health Medical Data
5	Study Of Prestigious Traditional Chinese Physicians’ Medication Rules Based On Machine Learning And Data Mining
6	Building A Local Data Warehouse And Primary Application Using Network Data In Hospitals
7	Research On Complex Noise Induced Hearing Loss Prediction Model Based On Machine Learning
8	Prediction Of LncRNA-disease Association Based On Heterogeneous Network And Machine Learning
9	Analysis Of The Prescription Rules Of Electric Acupuncture For Peripheral Facial Paralysis Based On Complex Network And Data Mining Technology
10	Analysis And Prediction Of Risk Factors For Anxiety And Depression After Ischemic Stroke Based On Machine Learning And Complex Network Models