Construction And Analysis Application Of Medical Case Report Literature Library Based On Big Data Technology

Posted on:2022-03-10

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Mu

Full Text:PDF

GTID:2494306560491314

Subject:Software engineering

Abstract/Summary:

With the widespread application of computer technology in the medical field, using big data development and artificial intelligence technology to mine the key information in medical case literature remains a challenge. A medical big data development program with high reference value must overcome several difficulties: exploring data processing schemes at different scales, designing highly available data warehouses, providing accurate analysis results, and implementing a simple, easy-to-use visualization platform. This thesis mines text information from the free full-text database of biomedical and life sciences journal literature maintained by the U.S. National Institutes of Health's National Library of Medicine.

To manage the library data in a standardized way, the thesis draws on the experience of leading Internet companies in large-scale data development. Across three stages (data ETL, selection and deployment of data pipeline components, and data analysis and visualization), it introduces a flat data warehouse modeling method, a containerized distributed development environment, and a configurable data analysis and visualization platform, which together form a complete medical case literature data warehouse and an efficient data processing pipeline. During the preliminary technical research, the commonly used technology stacks for storage, computation, query, analysis, and visualization were studied and compared in depth. Among them, the Kylin, Impala, and Hive distributed services were deployed, and their query performance was tested on data sets of tens of millions of records.

In the text processing part, the medical case report literature data is cleaned and extracted with commonly used machine learning preprocessing libraries and natural language processing algorithms, so that the unstructured text is split sensibly and its information is mined effectively. Self-developed user-defined functions are then used to build a complete data warehouse of medical case literature. Based on the constructed data warehouse and the algorithm's classification results, the system supports analysis of the literature by research field, funding institution, country of publication, publication period, and similar dimensions, word frequency statistics for the keywords of each module, and multi-dimensional, multi-metric analysis through multi-table join queries. Finally, Superset, which offers rich display styles and supports mounting multiple data sources, is used to visualize the time distribution, research field distribution, and geographical distribution of the documents and a keyword word cloud, giving an overall view of the data in the literature database. The medical big data development program studied in the thesis offers a useful reference for processing and mining complex, massive text data.
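The per-module keyword word frequency statistics mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the module names (`title`, `abstract`), the sample text, the stop-word list, and the function names `keyword_counts` and `counts_by_module` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would use a much fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "with", "was", "to", "for", "on"}


def keyword_counts(text: str) -> Counter:
    """Count lower-cased alphabetic tokens, skipping stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def counts_by_module(report: dict[str, str]) -> dict[str, Counter]:
    """Compute keyword frequencies separately for each module of a report."""
    return {module: keyword_counts(body) for module, body in report.items()}


# Made-up case report split into two assumed modules.
report = {
    "title": "Acute pancreatitis in a patient with lupus",
    "abstract": (
        "We report a patient with lupus who developed acute pancreatitis. "
        "The patient recovered."
    ),
}

stats = counts_by_module(report)
for module, counter in stats.items():
    print(module, counter.most_common(3))
```

In a warehouse setting, the same counting logic would more plausibly live in a user-defined function applied to each text column, with the counts written to a table that the visualization layer reads for the word cloud.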

Keywords/Search Tags:

Data Warehouse, Medical Big Data, Data ETL, Data Visualization, Text Mining

Related items

1	The Study Of Some Problems In Medical Imaging Data Warehouse System
2	Big Data Analysis Of Electronic Medical Records Based On Data Warehouse
3	Application Research Of Data Mining Technology In The New Rural Cooperative Medical Information System
4	Using HTML5 To Design And Implementation Of Medical Health Data Visualization Tool
5	Research And Implementation Of Data Extraction-Conversion-Loading (ETL) System In Medical Data Warehouse
6	Research And Implementation Of Medical Data Mining Visualization System
7	Design And Implementation Of Data Processing And Analysis Of Rehabilitation Equipment Based On Big Data
8	Design And Implementation Of Drug Quality Management And Evaluation Based On Data Warehouse
9	Design And Implementation For Decision Support System Of Drug Administration Based On Data Warehouse
10	The Design And Implementation Of The Hypertension Management Data Warehouse In The Clinical Data Center Of County-level Hospitals