Research And Application Of The Screening Mechanism Of Medical Big Data Based On MapReduce Technology

Posted on:2020-07-07

Degree:Master

Type:Thesis

Country:China

Candidate:J X Chen

GTID:2404330602950154

Subject:Biomedical engineering

Abstract/Summary:

Screening medical data is an important part of medical work and an important application of data analysis and data query technology in the medical field. Effective screening methods and query mechanisms help with the mining and use of medical data, and support data application requirements such as information statistics, personalized medical treatment, decision support, follow-up, drug discovery, health management, and precision medicine. There are two main challenges in screening medical data. First, because of the large amount of data, the storage and computation of large-scale data exceed the performance limits of traditional relational databases. Second, because of the complexity of data structure types, screening medical data requires specific processing methods, especially for unstructured data.

Based on the characteristics of current medical data, this paper studies distributed computing methods for clinical data. It mainly uses Hadoop, an open source big data tool, together with the MapReduce computing model to propose a parallel screening mechanism for multi-structured medical data. The mechanism adopts the ideas of a unified platform, classified processing, and easy expansion, and incorporates multi-structured data into a unified MapReduce computing platform. This paper mainly implements screening computation for structured data, time series data, and medical text data in the platform, to solve the distributed computation and cross-structure screening of multi-structured medical data. In addition, it optimizes the query algorithm and improves the efficiency of screening.

The main contents of this paper are as follows:

1. Distributed storage and query optimization of massive structured medical data
2. Distributed index creation and query optimization of massive medical time series data
3. Distributed index and query of massive medical text data
4. Screening platform for medical big data

The innovations of this paper are as follows:

1. This paper uses the data warehouse technology Hive, based on the MapReduce architecture, to realize distributed storage and query optimization of massive structured medical data.
2. This paper presents DB-DSTree, a distributed time series index based on MapReduce. A parallel DSTree index method based on the DHD index is proposed, and local grouping of batch queries effectively addresses the imbalance of the DSTree, which can significantly improve the efficiency of batch queries.
3. This paper presents a method for distributed storage and query of massive text based on MapReduce.
4. Based on the distributed storage and query methods for structured medical data, medical time series data, and medical text data, this paper establishes a screening platform for medical big data.
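
As a rough illustration of how a screening query maps onto the MapReduce model, the following single-process Python sketch filters records in a map phase, groups them by key, and aggregates them in a reduce phase. It is not the thesis's implementation: the record fields (`patient_id`, `test`, `value`), the glucose threshold, and all function names are hypothetical, and a real deployment would run the phases on Hadoop rather than in memory.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, Any]  # hypothetical record layout, not taken from the thesis


def map_screen(record: Record,
               predicate: Callable[[Record], bool]) -> Iterator[Tuple[str, Record]]:
    """Map phase: emit (patient_id, record) only for records that pass the predicate."""
    if predicate(record):
        yield record["patient_id"], record


def shuffle(pairs: Iterable[Tuple[str, Record]]) -> Dict[str, List[Record]]:
    """Shuffle phase: group the emitted values by key."""
    grouped: Dict[str, List[Record]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_count(patient_id: str, records: List[Record]) -> Tuple[str, int]:
    """Reduce phase: count the matching records for one patient."""
    return patient_id, len(records)


def screen(records: Iterable[Record],
           predicate: Callable[[Record], bool]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence inside one process."""
    pairs = (pair for record in records for pair in map_screen(record, predicate))
    grouped = shuffle(pairs)
    return dict(reduce_count(key, values) for key, values in grouped.items())


def is_high_glucose(record: Record) -> bool:
    """Illustrative screening condition; the 7.0 threshold is an assumption."""
    return record["test"] == "glucose" and record["value"] > 7.0


# Made-up sample data for illustration only.
records = [
    {"patient_id": "p1", "test": "glucose", "value": 7.8},
    {"patient_id": "p1", "test": "glucose", "value": 5.1},
    {"patient_id": "p2", "test": "glucose", "value": 9.2},
    {"patient_id": "p3", "test": "glucose", "value": 4.9},
    {"patient_id": "p1", "test": "glucose", "value": 8.4},
]

if __name__ == "__main__":
    print(screen(records, is_high_glucose))
```

Because the map step looks at one record at a time and the reduce step looks at one key at a time, both can be distributed across nodes without changing the screening condition itself.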

Keywords/Search Tags:

MapReduce, Hadoop, Hive, Data Screening, Medical, Time Series, Solr

Related items

1	Research On Optimization Of FP-Growth Algorithm Based On Cloud Computing And Medical Big Data
2	Analysis And Prediction Of Big Data Of Chinese Medicinal Materials Based On R+ Hadoop
3	Research On Time Series Medical Data Analysis Method Based On LSTM
4	Chinese Parallel LDA Algorithm Based On Hadoop And Data Mining In Electronic Medical Records
5	Research And Improvement Of Apriori Algorithm For Medical Cloud Data Based On Hadoop
6	Research On Medical Insurance Data Mining Based On Hadoop
7	Analysis And Research Application Of Hyperthyroidism Disease Model Based On Medical Big Data
8	Design And Implementation Of ECG Data Acquisition And Storage System Based On Hadoop
9	Analysis And Research Of Tumor Mode Based On Medical Big Data
10	Research On Analysis And Application Algorithm Of Health Monitoring Time Series Data