| Medical clinical data can be regarded as a collection of clinical events generated by patients during one or more hospitalizations. Clinical events include drug records, disease diagnosis records, physiological indicators, laboratory test results, non-text records (computed tomography, electrocardiograms, sound recordings, etc.), disease history, genetic history, and diagnosis and treatment costs. These clinical events are recorded in electronic medical records (EMRs), and their complexity is now close to the genome scale. By analyzing multiple clinical events, researchers can predict the likelihood of a patient’s future illness, which is of great significance for the early detection and treatment of diseases. However, precisely because clinical events are so complex and diverse, making better use of them to predict future diseases is particularly challenging. A medical concept is a clinical event that carries rich semantic information, such as a drug record or a disease diagnosis record. Unlike numeric data such as blood pressure and blood glucose measurements, medical concepts imply semantic relationships, and the potential connections between events are very complicated. Constructing a better medical concept representation is the key to improving the accuracy of disease prediction. At present, researchers who use medical concepts to predict disease diagnoses face three main challenges. First, medical concept representation, that is, how to effectively use the semantic information implicit in medical concepts. Most studies use one-hot vectors to represent medical concepts, which makes the input matrix highly sparse and discards rich semantics. Second, the time dependence of clinical events. Time information is particularly important for tracking the course of a patient’s disease. Relative to early events, late events are more valuable. Moreover, the time interval between clinical events is irregular, making it difficult to use traditional 
models for analysis. Most methods treat all clinical events as equally spaced, and thus cannot comprehensively utilize the patient’s long-term and short-term disease information. Third, there are many types of clinical events, and the various event relationships need to be better integrated. Many studies model a single event type based on expert knowledge and cannot use the implicit relationships between multiple events, which leaves room for further improvement. In response to the above problems, we carried out research on both medical concept representation and clinical diagnosis prediction, and established a medical concept representation method based on deep time-controlled graph convolution that addresses semantic loss and matrix sparseness and makes full use of time information, semantic information, and event relationships to predict future events. Our main innovations are as follows: (1) To address semantic loss and matrix sparseness, we established a fine-grained medical concept representation that captures the character-level semantic information implied in medical concepts and also addresses the matrix sparseness problem. First, we divided medical concepts at a fine granularity, analyzed the internal structure of the concepts, and captured character-level medical concept information. Second, we calculated medical semantic similarity and then established a character-level shared representation for word vectors, thus retaining rich medical professional semantic information. Third, we conducted experiments on public data sets and showed that the medical concept representation has clustering features, which lays a good foundation for subsequent prediction work. (2) We proposed an improved long short-term memory (LSTM) network that can model different time intervals. The model can comprehensively use early events and late events to predict disease diagnosis. First, we extracted the patient’s historical clinical events and constructed a complete patient course 
vector in chronological order. Second, we added a time control unit to the LSTM structure, which gives different weights to events at different time intervals, so the model can handle events separated by variable-length intervals. Third, we conducted a large number of comparative experiments on real data sets, and the results showed that the time control unit significantly improves the accuracy of the prediction model and is highly competitive. (3) We constructed a clinical prediction model for multi-dimensional events, the deep time-controlled graph convolution network, to improve the accuracy of clinical diagnosis. First, we comprehensively used a variety of clinical events and generated heterogeneous graphs for the various event relationships according to the multi-dimensional and heterogeneous characteristics of the events. Second, we performed convolution operations on the constructed heterogeneous graphs to establish a fused representation of the event relationship types. Third, we combined the time control unit with the graph neural structure to construct the deep time-controlled graph convolution network, which processes multi-dimensional data with time information. We conducted extensive experiments on MIMIC-III, a large public multi-parameter intensive care database, to objectively evaluate model performance. The results showed that the deep time-controlled graph convolution model achieved higher accuracy in clinical event prediction and is of great significance to medical information research. |
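
The abstract does not give the exact form of the time control unit, so the following is only an illustrative sketch of the general idea of weighting events by their time gap. It assumes a logarithmic decay, 1 / log(e + Δt), which is a common choice in time-aware recurrent models and is not necessarily the one used here. The function names `time_decay` and `pooled_history` are hypothetical, and the pooling step stands in for the LSTM purely to make the weighting visible.

```python
import math


def time_decay(delta_days: float) -> float:
    """Map a non-negative time gap to a weight in (0, 1].

    Assumed decay form (not taken from the source): 1 / log(e + gap).
    A gap of zero gives weight 1; larger gaps give smaller weights.
    """
    if delta_days < 0:
        raise ValueError("time gap must be non-negative")
    return 1.0 / math.log(math.e + delta_days)


def pooled_history(event_vectors, event_days, predict_day):
    """Combine event vectors into one vector, favoring recent events.

    event_vectors: list of equal-length lists of floats, one per event.
    event_days:    day index of each event (irregular spacing is fine).
    predict_day:   day at which the prediction is made.
    """
    weights = [time_decay(predict_day - day) for day in event_days]
    total = sum(weights)
    dim = len(event_vectors[0])
    # Normalized weighted average over events, computed per dimension.
    return [
        sum(w * vec[i] for w, vec in zip(weights, event_vectors)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two toy events: one 100 days before prediction, one the day before.
    vectors = [[1.0, 0.0], [0.0, 1.0]]
    days = [0, 99]
    print(pooled_history(vectors, days, predict_day=100))
```

In this toy run the second component of the result is larger than the first, because the event recorded one day before the prediction receives a higher weight than the event recorded 100 days before it. In a time-controlled LSTM the analogous weight would typically modulate the cell memory between steps rather than a simple average.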