Research On Privacy Protection Method For TCM Clinical Big Data Sharing

Posted on: 2023-10-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Yu
GTID: 1524306611470944
Subject: TCM History and Literature
Abstract/Summary:
Background: With the development of information technology, hospitals' massive medical and personal medical record data have gradually moved from traditional paper storage to electronic storage. Electronically stored medical data include sensitive patient data, such as health data, genetic history, mobile phone numbers, and ID numbers, as well as sensitive information such as important Chinese medicine prescriptions. These sensitive data not only record and store the patient’s medical information, but can also be opened to third parties (such as universities and scientific research institutions), which collect, store, manage, analyze, mine, and transmit them over the network. Mining, analyzing, sharing, and opening personal medical data over the network carries a serious risk of privacy leakage. To address these risks, a privacy protection scheme for TCM clinical big data is needed.

Objective: For the transmission and sharing of TCM clinical big data, a deep learning algorithm and training model are designed to dynamically identify sensitive personal information in TCM clinical medical data and to improve recognition efficiency and accuracy. A data encryption algorithm suited to the TCM big data scenario is designed and combined with a privacy protection scheme based on attribute encryption and structure authorization to protect the privacy of sensitive TCM clinical information and prevent its leakage. An algorithm based on homomorphic encryption and zero-knowledge proof is designed to provide a sharing and access mechanism for TCM clinical big data across TCM medical institutions, ensuring the security and traceability of medical data and transactions. This study provides a secure and efficient data source for each link of TCM clinical big
data sharing, and provides services and support for the efficient and secure transmission of massive personal privacy data on the TCM big data platform.

Methods: For each link of TCM clinical big data sharing, this thesis uses modern information technology to analyze the privacy protection mechanism for sensitive data. The following research methods are adopted:

1) Structured medical text data that follow standard syntax and semantics are preprocessed through data cleaning, data conversion, data description, and feature extraction. Natural language processing and the CBOW model of word2vec are used to train on a large corpus of TCM medical text to obtain the corresponding word vectors. Word vectors with high similarity are added to the medical sensitive-information dictionary. To model the time-series text data, an algorithm based on deep learning with a Bi-LSTM training model is designed. A CRF (conditional random field) statistical probability decision model is also introduced: based on implicit state modeling, the relationship between state sequences is studied to obtain the best observation sequence, which is used for probabilistic decision-making and statistics in labeling sensitive word sequences in medical text. The output layer of the Bi-LSTM is connected to the CRF layer to make sentence-level label predictions, improving the accuracy of sensitive word prediction in TCM medical text data.

2) The account and password management scheme, login authentication management scheme, and authorization management scheme for TCM big data are studied. Account security management, password security management, the password encryption process, and password attacks and countermeasures are designed and analyzed. By comparing the PBKDF2 password encryption algorithm with the AES-128 algorithm, the encryption algorithm suitable for the account order of
traditional Chinese medicine big data platform is selected. According to the structured characteristics of the clinical medical text data in TCM big data, the medical text data are formally described and analyzed, the patient’s clinical medical data are divided into a sensitive data vector and a non-sensitive data vector, the attribute features of the sensitive data are extracted, and an inner product encryption algorithm based on sensitive data attributes and a specific ciphertext hash is designed. The algorithm is then studied and compared under different parameters and data volumes, chosen according to the characteristics of medical application scenarios and medical data.

3) Structured medical data consist of redundant structural information and the medical data held in leaf nodes. A trunk structure tree is extracted to remove the redundancy of the structured data, and the tree is converted into a corresponding storage matrix. The content of leaf nodes is interval coded to facilitate data queries, so that structure information and node content are stored separately. With the cooperation of cloud and fog nodes, a medical data protection scheme based on attribute encryption and XML structure authorization is designed. Shared medical documents are authorized through an authorization matrix to achieve fine-grained access to medical data.

4) Homomorphic encryption on the blockchain has the property that operating directly on ciphertexts gives the same result as performing the operation on the plaintexts and then encrypting. A party without the secret key stores and transmits only encrypted results and cannot obtain the underlying data, so encrypted data can be processed without disclosing any original information. Based on this property, a homomorphic encryption algorithm based on Paillier addition and a range-based zero-knowledge proof
algorithm are designed and proposed for sharing and accessing medical data across TCM data platforms while ensuring the security and traceability of the medical data. In addition, for high-frequency, small-volume medical data access, a security and privacy protection scheme for off-chain channels is designed to secure off-chain transactions between the two parties on the medical blockchain and to improve the access efficiency of medical data.

Results:

1) Medical data from some departments of a hospital of integrated traditional Chinese and Western medicine in a Chinese city were mined. After preprocessing the medical text of the TCM big data, natural-language words were transformed into dense vectors through word2vec, and a medical sensitive word vector and dictionary library were established. Sensitive medical feature data were modeled with Bi-LSTM+CRF, and recognition of sensitive medical entities (gender, name, ID number, mobile phone number, and address) was tested on 5000 medical data records.

2) A data desensitization algorithm based on attribute inner product encryption is designed and implemented. It divides the sensitive information in batches of TCM clinical big data into data granules of different lengths and computes the inner product with the hash of a specific ciphertext. For massive data encryption on the TCM big data platform, compared with traditional hash encryption algorithms, this algorithm offers flexible data granularity and strategy and efficient performance, making it suitable for desensitizing massive TCM clinical data.

3) A clinical medical data protection scheme based on structure authorization and attribute encryption is designed and implemented. With the help of cloud and fog nodes, fine-grained access control is implemented using an attribute encryption algorithm. According to the semi-structured
characteristics of medical data, the backbone structure tree is extracted to remove structural redundancy, and start-stop interval coding is used to represent node information; together these complete the data privacy protection. The security proof and results show that the data protection scheme based on attribute encryption and structure authorization has high security and low storage overhead, and is suitable for high-level security protection of TCM clinical medical data shared in a cloud storage environment.

4) A blockchain security and privacy protection scheme based on Paillier additive homomorphic encryption and a ring-signature-based range zero-knowledge proof is designed and implemented. It addresses the problems of TCM clinical big data, enables data sharing, connects each link of the process, improves the transparency and traceability of medical data, meets the demand for sharing and accessing patients’ medical data across TCM medical platforms, and ensures the security and privacy of sensitive personal medical data. When clinical medical data are stored on the blockchain, all TCM hospitals and medical insurance centers can share and access patients’ medical data. The scheme and algorithm also have good access performance.

Conclusion: In this thesis, algorithms and schemes are designed to identify, encrypt, share, and access sensitive medical data in each link of the TCM clinical big data sharing process, providing high-performance, high-level privacy and security for TCM clinical big data sharing.
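
As an illustration of the additive homomorphic property that the Methods attribute to Paillier encryption (multiplying two ciphertexts yields an encryption of the sum of the plaintexts), a minimal textbook sketch follows. It is not the dissertation's algorithm. The function names (keygen, encrypt, decrypt, add_encrypted), the tiny primes, the sample values, and the common simplification g = n + 1 are all assumptions made for this example; a real deployment would use far larger primes and a vetted cryptographic library.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two distinct primes (assumed inputs)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1  # common simplification; then L(g^lam mod n^2) = lam mod n
    mu = pow(lam, -1, n)  # modular inverse of lam modulo n (Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(pub, m: int) -> int:
    """Encrypt m (0 <= m < n) as g^m * r^n mod n^2 with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # r in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c: int) -> int:
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1: int, c2: int) -> int:
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    n, _ = pub
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    pub, priv = keygen(1009, 1013)  # toy primes, for illustration only
    c1 = encrypt(pub, 120)  # hypothetical value held by one institution
    c2 = encrypt(pub, 45)   # hypothetical value held by another
    total = add_encrypted(pub, c1, c2)  # computed without the secret key
    print(decrypt(pub, priv, total))  # prints 165
```

The party that calls add_encrypted needs only the public key, which matches the abstract's point that encrypted results can be stored, transmitted, and processed without revealing the underlying data.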
Keywords/Search Tags: Big data of traditional Chinese medicine, Deep learning, Data desensitization, Access authorization, Privacy protection, Medical blockchain