
Research On Representation Learning And Application Of Electronic Health Record Data

Posted on: 2021-11-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y J Feng
GTID: 1484306542996519
Subject: Computer Science and Technology
Abstract/Summary:
Driven jointly by national policy, industry capital and intelligent information technology, healthcare has entered the era of big data. As medical informatization has developed, the healthcare data accumulated in electronic health record (EHR) systems has become large in scale, varied in type and rapid in generation. Representing EHR data effectively and building intelligent computational models that fit the needs of medical problems are key to the research and application of big healthcare data, and are of great theoretical significance and practical value.

Representation learning of EHR data refers to learning a low-dimensional, dense representation from the original health records. The high dimensionality, sparsity, temporality and heterogeneity of healthcare data, together with the specialized nature of medical domain knowledge, pose great challenges to representation learning. The objective of this dissertation is to study representation learning and intelligent applications of healthcare data in EHRs. To address the challenges and problems in previous representation learning methods, we systematically carried out the following research work.

To address the high dimensionality and sparsity of discrete medical concepts and the difficulty of incorporating medical domain knowledge, we proposed a representation learning framework that integrates domain knowledge, including a multi-granularity embedding-based representation learning method and a multi-channel convolutional neural network. Experimental results demonstrated that our models can effectively capture the inherent relatedness between medical concepts. Moreover, integrating domain knowledge alleviated, to some extent, the problem of insufficient data when learning representations of rare medical concepts. By combining unsupervised and supervised learning, the multi-channel convolutional neural network not only learned better representations of medical concepts and patients but also achieved the best performance in predicting medical resource consumption.

To address the temporality and irregularity of low-frequency clinical time-series data, we proposed a feature representation method that calculates statistics over temporal sampling windows, and applied it to the early diagnosis of acute kidney injury in the intensive care unit. We also proposed two class-balancing methods, one based on a case-control matching strategy and one based on an individualized predictive model, which improved model performance on class-imbalanced data.

To address the multisource, heterogeneous and asynchronously sampled nature of multimodal healthcare data, we proposed a fusion representation learning framework for multimodal data and applied it to predicting mortality risk in the intensive care unit and predicting total hospitalization cost. To capture long-term dependencies within medical time-series data, we first applied a long short-term memory neural network (LSTM) to learn the representation of low-frequency clinical time-series data and proposed a two-dimensional convolutional long short-term memory neural network (CNN-LSTM) to learn the representation of high-frequency electrocardiogram (ECG) signals. Furthermore, we proposed two variations of double-core memory networks to learn the fusion representation of multimodal healthcare data. Experimental results demonstrated that our double-core memory networks outperformed traditional fusion representation methods in both prediction tasks.

In summary, we proposed multiple representation learning methods for EHR data to mine the information in healthcare data and to provide solutions to medical problems.
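As an illustration of the window-based statistical features described above for low-frequency, irregularly sampled clinical time series, the following is a minimal sketch only, not the dissertation's implementation. The function name `window_features`, the choice of statistics (count, mean, min, max), the 6-hour window and the sample values are all assumptions made for the example; the abstract does not specify them.

```python
from statistics import mean


def window_features(samples, window, horizon):
    """Summarize irregular (time, value) samples over fixed-width windows.

    samples: list of (time, value) pairs; times need not be evenly spaced.
    window:  window width, in the same time unit as the samples.
    horizon: total observation length; windows cover [0, horizon).
    Returns one dict of statistics per window (hypothetical feature set).
    """
    n_windows = int(horizon // window)
    buckets = [[] for _ in range(n_windows)]

    # Assign each measurement to the window that contains its timestamp.
    for t, v in samples:
        idx = int(t // window)
        if 0 <= idx < n_windows:
            buckets[idx].append(v)

    features = []
    for i, vals in enumerate(buckets):
        if vals:
            stats = {"count": len(vals), "mean": mean(vals),
                     "min": min(vals), "max": max(vals)}
        else:
            # No measurement in this window: mark the statistics as missing
            # rather than inventing a value.
            stats = {"count": 0, "mean": None, "min": None, "max": None}
        stats["start"] = i * window
        features.append(stats)
    return features


# Hypothetical lab measurements at irregular times (hours since admission).
example = [(0.5, 1.0), (2.0, 1.5), (7.0, 2.0), (20.0, 3.0)]
for row in window_features(example, window=6, horizon=24):
    print(row)
```

Each patient record becomes a fixed-length sequence of per-window statistics regardless of how many measurements were taken, which is what lets irregular data feed a standard classifier. How empty windows are handled (left missing here) is a design choice the abstract does not describe.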
Keywords/Search Tags: Electronic Health Record, Representation Learning, Convolutional Neural Network, Double-core Memory Network, Multimodal Data