
Research On Optimization Of FP-Growth Algorithm Based On Cloud Computing And Medical Big Data

Posted on: 2019-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: C Huang
GTID: 2394330563959582
Subject: Engineering
Abstract/Summary:
Rapid technological development and its spread into every field have caused data volumes to soar. Data mining methods make it possible to discover valuable knowledge hidden in such massive data, and association rule mining, an important branch of data mining, has received considerable attention and has matured in recent years. For massive medical data, the question is how to apply association rule mining effectively to extract the underlying rules and turn them into information that supports disease prevention, evaluation of drug treatment effects, and monitoring of clinical conditions.

Chronic diseases such as diabetes and high blood pressure, together with their complications, pose a serious threat to human health, causing great suffering for patients and increasing the burden on society. Effective prevention and treatment are therefore of great importance. Because the pathogenesis of these diseases is often variable and complex, they are difficult to diagnose accurately in advance. The onset and progression of any disease nevertheless follow certain patterns (trajectories), and assessing a patient's condition helps in planning appropriate interventions and reducing the harm the disease does to the patient. It is therefore valuable to use data mining to build a chronic disease classification and decision-making model that meets the needs of preventive medicine and assists doctors in diagnosis and clinical treatment. Existing methods, however, become increasingly inapplicable as medical data keeps growing. Applying existing techniques in a distributed environment to extract, from massive data, information that helps prevent and treat chronic diseases is the motivation for this research and the source of its significance.

This thesis uses the FP-Growth algorithm to mine and analyze medical big data. To address the low efficiency of the traditional FP-Growth algorithm on large-scale data, an improved FP-Growth algorithm is proposed. It partitions the database into subsets according to the frequent itemsets and builds the FP-tree for each condition directly, which greatly reduces memory usage. In addition, a two-dimensional table records the support and support count of the items, which helps the algorithm run efficiently and reduces the querying of the server and database. To further improve performance, the FP-tree of the classic FP-Growth algorithm is pruned using an item consolidation strategy, which improves mining efficiency. Combining the improved FP-Growth algorithm with the MapReduce programming model of the distributed computing framework Hadoop further improves mining efficiency in big data environments. Experiments show that the improved FP-Growth algorithm based on Hadoop is more efficient than the traditional FP-Growth algorithm.
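For orientation, the following is a minimal Python sketch of the classic, single-machine FP-Growth algorithm that the abstract takes as its baseline. It is not the improved algorithm proposed in the thesis: it does no database partitioning, keeps no two-dimensional support table, applies no item consolidation pruning, and does not use Hadoop or MapReduce. The names `Node`, `build_tree`, `fp_growth`, and `mine`, and the toy records in the demo, are illustrative assumptions and do not come from the thesis.

```python
from collections import Counter, defaultdict


class Node:
    """One FP-tree node: an item, a link to its parent, and a count."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs.

    Returns the header table (item -> nodes holding it) and the
    support of every frequent item.
    """
    support = Counter()
    for items, count in transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}

    root = Node(None, None)
    header = defaultdict(list)
    for items, count in transactions:
        # Keep frequent items only, ordered by descending support
        # (ties broken by item name so the order is deterministic).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) for every frequent itemset."""
    header, frequent = build_tree(transactions, min_support)
    for item, sup in frequent.items():
        itemset = suffix | {item}
        yield itemset, sup
        # Conditional pattern base: the prefix path of every node
        # that carries `item`, weighted by that node's count.
        conditional = []
        for node in header[item]:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Recurse on the conditional database for this item.
        yield from fp_growth(conditional, min_support, itemset)


def mine(records, min_support):
    """Convenience wrapper: records is a list of item lists."""
    return dict(fp_growth([(r, 1) for r in records], min_support))


if __name__ == "__main__":
    # Made-up toy records, for illustration only (not real medical data).
    records = [
        ["diabetes", "hypertension"],
        ["diabetes", "hypertension", "retinopathy"],
        ["hypertension", "retinopathy"],
        ["diabetes", "retinopathy"],
    ]
    for itemset, sup in mine(records, min_support=2).items():
        print(sorted(itemset), sup)
```

Because each recursive call rebuilds a tree from a conditional pattern base, memory use and run time grow quickly on large inputs, which is the inefficiency that the partitioning, pruning, and MapReduce steps described above are meant to address.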
Keywords/Search Tags:Data Mining, Association Rules, FP-Growth, Hadoop, MapReduce