Study On Multicenter Collaborative Data Mining And Modeling Methods For Electronic Health Records

Posted on: 2023-01-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Li
GTID: 1524306836954859
Subject: Biomedical engineering
Abstract/Summary:
In recent years, biomedical data has grown explosively. With the advancement of hospital digitalization and the adoption of electronic health record systems in particular, an enormous volume of clinical data has accumulated across medical institutions. This data can support data-driven clinical studies and artificial intelligence applications in healthcare-related research. Traditional single-center data analysis is limited by the scale and complexity of the data collected at one site, and therefore cannot provide sufficient training samples for developing data-driven machine learning models. Multi-center collaborative data mining, which leverages medical data across multiple institutions and breaks down ‘data silos’, has emerged as a solution. Through cooperation between medical institutions, it can integrate large-scale electronic health record data for downstream tasks and further improve the quality of clinical decision support and healthcare services. However, existing multi-center collaborative modeling methods still have shortcomings in privacy preservation, scalability, and generalization ability. This study therefore proposed a series of multi-center collaborative data mining and modeling methods for electronic health records to overcome these challenges in clinical data mining scenarios. The proposed methods can benefit artificial intelligence development in medicine and clinical decision support. The main contributions of this study are as follows.

First, a differentially private generative adversarial network was proposed to synthesize electronic health records in a privacy-preserving manner. The proposed model can provide high-quality synthetic patient time series with different data types for various application scenarios. The differential privacy techniques adopted provide a mathematical guarantee on the privacy of patients’ information. Compared with existing synthetic data generation schemes, the proposed generative adversarial network can output synthetic electronic medical records with high fidelity, flexibility, utility, and privacy. The proposed method therefore offers an effective solution for privacy-preserving data sharing in multi-center collaborative data mining scenarios.

Second, a multi-center collaborative model training framework was proposed for building random forest models in a distributed manner. By leveraging federated learning techniques, the framework enables multiple medical institutions to build a random forest without sharing the original patient-level electronic health record data. The method provides a way to build multi-center machine learning models efficiently and privately. It can make full use of the large-scale electronic health records acquired from different medical institutions and extend machine learning model training from a single center to multiple centers.

Third, a transfer learning approach was proposed to improve the multi-center model’s generalization ability at external medical institutions. The proposed multi-center collaborative transfer learning model can provide accurate and reliable predictions for a target domain with insufficient training samples or labeled data. It leverages the large-scale electronic health records in multiple source domains while also using the training samples available in the target domain. The model thereby improves the generalization ability of the multi-center model and supports its effective application in real clinical settings.

In summary, this study proposed a series of multi-center collaborative data mining methods focused on overcoming the obstacles in real-world electronic health record data mining, and aimed to address the key issues that arise when medical data mining expands from a single institution to multiple institutions. The proposed methods can realize the value of medical data in real clinical practice and provide strong technical support for improving the quality of healthcare services.
Keywords/Search Tags: Electronic health record, Multi-center collaborative data mining, Machine learning, Medical data privacy preservation, Generalization ability