Heterogeneous Information Network Representation Learning For Disease Prediction

Posted on: 2022-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Sun
GTID: 2494306308499654
Subject: Software engineering
Abstract/Summary:
The application of Electronic Health Records (EHR) has driven healthcare research toward data-driven methods and has brought new opportunities for more personalized health guidance. Disease prediction is one of the important research topics in healthcare big data. The goal is to evaluate and predict a patient’s health status from the patient’s personal information and historical electronic health profiles, and to assist medical institutions in diagnosis so as to improve its efficiency and quality and reduce the cost of treatment for patients. Building on big data analysis in the healthcare domain, data mining and machine learning methods have been applied to disease prediction tasks in recent years and have achieved good prediction results. However, existing methods generally rely on the temporal features of patients’ historical medical records. They give little consideration to potential relationships among medical codes, such as the relationship between a disease and its complications, or between a disease and its symptoms. These relationships might be significant in healthcare research, for example for querying and discovering related disease information from massive data through key symptoms, or for using historical diagnoses to analyze a patient’s future health status. In this thesis, we focus on the complicated relationships among the medical information contained in electronic health records, building heterogeneous medical information networks from EHR and medical knowledge data to learn the latent relationships between pieces of medical information.

Firstly, we introduce an innovative model based on Graph Neural Networks (GNN) for disease prediction. It utilizes external knowledge bases to augment insufficient EHR data, and learns highly representative node embeddings for patients, diseases, and symptoms from the medical concept graph and the patient record graph, constructed from the medical knowledge base and the EHRs respectively. By aggregating information from directly connected neighbor nodes, the proposed neural graph encoder can effectively generate embeddings that capture knowledge from both data sources, and it is able to inductively infer the embedding of a new patient from the symptoms reported in her/his EHRs, allowing accurate prediction of both general and rare diseases.

Secondly, we introduce a novel model based on a self-attentional medical path reasoning network for disease prediction. It utilizes medical paths extracted from patient EHRs and external medical knowledge bases to augment the latent interactions between diseases and to learn highly representative patient embeddings. By explicitly incorporating medical paths, the proposed path encoder can effectively generate embeddings that capture the hierarchical information of diseases, and the self-attentional reasoning network is able to learn effective patient representations from the historical admission sequences in her/his EHRs, allowing accurate disease prediction for the next hospital admission.

Finally, we design experiments to evaluate the two disease prediction methods. By integrating the two methods, the proposed diagnosis system can make disease predictions from either a patient’s symptoms or his/her historical medical records. We collect EHR data from local medical institutions and a public intensive care medical information database, combine it with relevant medical knowledge to construct heterogeneous medical information networks, and develop a disease diagnosis system based on these networks. Extensive experiments on public medical datasets have demonstrated the state-of-the-art performance of our proposed model.
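The idea of inductively embedding a new patient by aggregating directly connected symptom nodes can be illustrated with a minimal Python sketch. It is not the thesis's actual encoder: the mean aggregator, the dot-product scoring, the function names, and the toy two-dimensional embeddings are all assumptions chosen for illustration, and a trained GNN would learn the embeddings and use learned transformations.

```python
from typing import Dict, List

Vector = List[float]


def mean_aggregate(neighbors: List[Vector]) -> Vector:
    """Average the embeddings of directly connected neighbor nodes.

    Assumption: a plain mean stands in for whatever aggregator the model uses.
    """
    if not neighbors:
        raise ValueError("a node needs at least one neighbor to aggregate from")
    dim = len(neighbors[0])
    return [sum(vec[i] for vec in neighbors) / len(neighbors) for i in range(dim)]


def embed_new_patient(symptoms: List[str], symptom_emb: Dict[str, Vector]) -> Vector:
    """Inductively embed an unseen patient from the symptoms in the record."""
    return mean_aggregate([symptom_emb[s] for s in symptoms])


def rank_diseases(patient: Vector, disease_emb: Dict[str, Vector]) -> List[str]:
    """Rank diseases by dot-product similarity to the patient embedding."""
    def score(name: str) -> float:
        return sum(p * d for p, d in zip(patient, disease_emb[name]))
    return sorted(disease_emb, key=score, reverse=True)


# Hypothetical, hand-picked embeddings; real ones would be learned from the
# medical concept graph and the patient record graph.
symptom_emb = {
    "symptom_a": [1.0, 0.0],
    "symptom_b": [0.8, 0.2],
    "symptom_c": [0.0, 1.0],
}
disease_emb = {
    "disease_x": [1.0, 0.1],
    "disease_y": [0.1, 1.0],
}

# A new patient is never seen in training; only the reported symptoms are known.
patient = embed_new_patient(["symptom_a", "symptom_b"], symptom_emb)
ranking = rank_diseases(patient, disease_emb)
print(ranking)
```

Because the patient embedding is computed from neighbor embeddings rather than looked up in a table, the same function applies to any patient whose symptoms already exist as nodes in the graph.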
Keywords/Search Tags: Disease Prediction, Heterogeneous Information Network, Graph Representation Learning