Electronic Medical Records (EMR) store detailed patient health information and comprise data in multiple modalities. Effectively using EMR data to cast practical medical problems as artificial intelligence models is key to advancing the informatization and intelligence of China's medical and health industry. Representation learning for EMR refers to extracting key information from high-dimensional, sparse raw patient data to generate low-dimensional, dense representations. In recent years, a large number of deep learning methods have emerged in the field of intelligent healthcare and have addressed many challenges in representation learning. However, most of these methods use data from only one modality, while models based on multimodal data often ignore the hidden relationships between data of different modalities. In view of the current scarcity of representation learning methods for multimodal EMR data and the shortcomings of existing approaches, this paper carries out the following work:

1) We propose MAIN, a representation learning model based on a multimodal attention mechanism. The model fully considers the characteristics of each modality and designs a corresponding feature-capture method to generate a representation of each modality's data. A module consisting of an inter-modal relation representation layer and a cross-modal attention layer is then designed to fully extract and exploit inter-modal relations. Finally, a patient representation module fuses the representations of the individual modalities with the inter-modal relation representations to generate visit representations, and uses the cross-modal attention weights to fuse a patient's visit sequence into a patient representation for disease prediction tasks. Experimental results show that the model achieves better prediction performance than existing methods.

2) The EMR data of each modality can be further divided into multiple types. However, most existing methods represent the data of each modality as a whole and cannot capture the finer-grained inter-modal relationships between different types of data. This paper therefore proposes MEGAT, a representation learning framework based on a relation graph over multimodal EMR data. The framework consists of three modules, each of which can be implemented with different approaches: a feature extraction module extracts information from each type of data; a graph-based inter-modal relation capture module builds a relation graph over all data types and extracts features with graph neural networks; and a patient representation module learns the patient's visit-sequence information and generates the patient representation. Experimental results show that the framework offers excellent scalability and representation ability.
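The abstract does not give implementation details for MAIN's cross-modal attention layer, so the following is only a minimal PyTorch sketch of the general idea under stated assumptions: each modality (e.g., diagnosis codes, lab results, clinical notes) has already been encoded into a fixed-size vector per visit, attention weights over modalities are computed from a learned scoring layer, and the weighted combination plus a simple pairwise interaction term forms the visit representation. The class and parameter names (CrossModalAttention, modality_dim) are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalAttention(nn.Module):
    """Illustrative sketch: fuse per-modality visit encodings with learned
    attention weights. Names and structure are assumptions, not MAIN's
    actual implementation."""

    def __init__(self, modality_dim: int):
        super().__init__()
        # Project each modality encoding to a scalar relevance score.
        self.score = nn.Linear(modality_dim, 1)
        # Pairwise bilinear interaction as a stand-in for the
        # "inter-modal relation representation" idea.
        self.relation = nn.Bilinear(modality_dim, modality_dim, modality_dim)

    def forward(self, modality_vecs: torch.Tensor):
        # modality_vecs: (batch, num_modalities, modality_dim)
        weights = F.softmax(self.score(modality_vecs), dim=1)   # (B, M, 1)
        fused = (weights * modality_vecs).sum(dim=1)            # (B, D)
        # Example pairwise relation between the first two modalities.
        rel = self.relation(modality_vecs[:, 0], modality_vecs[:, 1])
        visit_repr = fused + rel
        return visit_repr, weights.squeeze(-1)

if __name__ == "__main__":
    # Toy usage: 3 modalities, 64-dim encodings, batch of 2 visits.
    enc = torch.randn(2, 3, 64)
    layer = CrossModalAttention(modality_dim=64)
    visit_repr, attn = layer(enc)
    print(visit_repr.shape, attn.shape)  # torch.Size([2, 64]) torch.Size([2, 3])
```

The returned attention weights correspond to the role the abstract assigns to MAIN's cross-modal attention weights, which are reused when fusing the visit sequence into a patient representation.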
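Similarly, MEGAT's graph-based inter-modal relation capture module is only described at a high level, so the sketch below shows one plausible reading under explicit assumptions: each data type (e.g., diagnoses, medications, lab items) contributes one node whose feature is its extracted encoding, a fully connected relation graph links the types, and a single graph-convolution step propagates information between them before node features are pooled into a visit representation. The layer is a generic dense-adjacency graph convolution, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class RelationGraphLayer(nn.Module):
    """Sketch of a dense-adjacency graph convolution over data-type nodes.
    Assumed stand-in for MEGAT's graph-based relation capture module."""

    def __init__(self, dim: int):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor):
        # node_feats: (batch, num_types, dim); adj: (num_types, num_types)
        # Row-normalize the adjacency so each node averages its neighbors.
        norm_adj = adj / adj.sum(dim=-1, keepdim=True).clamp(min=1)
        msg = torch.einsum("ij,bjd->bid", norm_adj, node_feats)
        return torch.relu(self.lin(msg))

if __name__ == "__main__":
    # 4 data types (e.g., diagnoses, medications, labs, notes), 32-dim features.
    feats = torch.randn(2, 4, 32)
    adj = torch.ones(4, 4)              # fully connected relation graph
    layer = RelationGraphLayer(dim=32)
    out = layer(feats, adj)
    visit_repr = out.mean(dim=1)        # pool node features into a visit vector
    print(out.shape, visit_repr.shape)  # (2, 4, 32) (2, 32)
```

In the framework described by the abstract, this module would sit between the per-type feature extraction module and the patient representation module that models the visit sequence; replacing the pooled mean with an attention-based or sequence-aware readout would be one of the interchangeable implementations the framework allows.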