
Research and Application of the Screening Mechanism of Medical Big Data Based on MapReduce Technology

Posted on: 2020-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: J X Chen
GTID: 2404330602950154
Subject: Biomedical engineering
Abstract/Summary:
Screening medical data is both an important part of medical work and an important application of data analysis and data query technology in the medical field. Effective screening methods and query mechanisms help with the mining and utilization of medical data, and support data application requirements such as information statistics, personalized medical treatment, decision support, follow-up, drug discovery, health management, and precision medicine. There are two main challenges in screening medical data. First, because of the large amount of data, the storage and computation of large-scale data exceed the performance limits of traditional relational databases. Second, because of the complexity of data structure types, screening medical data requires specific processing methods, especially for unstructured data.

According to the characteristics of current medical data, this paper studies distributed computing methods for clinical data. It mainly uses Hadoop, an open source big data tool, and combines it with the MapReduce computing model to propose a parallel screening mechanism for multi-structured medical data. The mechanism adopts the ideas of a unified platform, classified processing, and easy expansion, and incorporates multi-structured data into a unified MapReduce computing platform. This paper mainly implements the screening calculation of structured data, time series data, and medical text data on the platform, to solve the distributed computation and cross-structure screening of multi-structured medical data. In addition, it optimizes the query algorithm and improves the efficiency of screening.

The main contents of this paper are as follows:
1. Distributed storage and query optimization of massive medical structured data
2. Distributed index creation and query optimization of massive medical time series data
3. Distributed index and query of massive medical text data
4. Screening platform for medical big data

The innovations of this paper are as follows:
1. This paper proposes using the Hive data warehouse, based on the MapReduce architecture, to realize distributed storage and query optimization of massive structured medical data.
2. This paper presents DB-DSTree, a distributed time series index based on MapReduce. A parallel DSTree index method based on the DHD index is proposed, and local grouping of batch queries effectively addresses the imbalance of DSTree, which significantly improves the efficiency of batch queries.
3. This paper presents a method for distributed storage and query of massive text based on MapReduce.
4. Based on the distributed storage and query methods for medical structured data, medical time series data, and medical text data, this paper establishes a screening platform for medical big data.
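The idea of cross-structure screening on a single MapReduce-style platform can be illustrated with a small in-memory sketch. This is not the thesis implementation and does not use Hadoop. The record fields, the two screening criteria, and the function names (`map_record`, `shuffle`, `screen`) are all hypothetical, chosen only to show how a map step can tag records of different structure types and a reduce step can keep the patients that satisfy every criterion.

```python
from collections import defaultdict

# Hypothetical records, split into shards the way input splits might be.
SHARDS = [
    [
        {"patient_id": "p1", "kind": "structured", "age": 67},
        {"patient_id": "p2", "kind": "structured", "age": 45},
    ],
    [
        {"patient_id": "p1", "kind": "text", "note": "Patient reports chest pain."},
        {"patient_id": "p3", "kind": "text", "note": "Routine follow-up."},
    ],
]

# Hypothetical screening criteria: name -> (record kind, predicate).
CRITERIA = {
    "age_over_60": ("structured", lambda r: r["age"] > 60),
    "mentions_chest_pain": ("text", lambda r: "chest pain" in r["note"].lower()),
}


def map_record(record):
    """Map step: emit (patient_id, criterion) for each criterion the record meets."""
    for name, (kind, predicate) in CRITERIA.items():
        if record["kind"] == kind and predicate(record):
            yield record["patient_id"], name


def shuffle(pairs):
    """Shuffle step: group emitted criterion names by patient_id."""
    groups = defaultdict(set)
    for patient_id, name in pairs:
        groups[patient_id].add(name)
    return groups


def screen(shards):
    """Reduce step: keep patients whose records satisfy every criterion."""
    pairs = (
        pair
        for shard in shards
        for record in shard
        for pair in map_record(record)
    )
    groups = shuffle(pairs)
    return sorted(pid for pid, matched in groups.items() if matched == set(CRITERIA))


if __name__ == "__main__":
    print(screen(SHARDS))
```

In a real cluster the map calls would run in parallel on separate nodes and the framework would perform the shuffle; here they run sequentially so the data flow is easy to follow.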
Keywords/Search Tags: MapReduce, Hadoop, Hive, Data Screening, Medical, Time Series, Solr