Research On Privacy Protection Of Chinese Medical Text Generation Task Based On Deep Learning

Posted on:2024-05-05

Degree:Master

Type:Thesis

Country:China

Candidate:Y X Jie

Full Text:PDF

GTID:2544306932455384

Subject:Cyberspace security

Abstract/Summary:

The Chinese medical text generation task is important for improving healthcare services and disseminating medical knowledge. However, it faces challenges that conventional text generation tasks do not. First, medical text data is scarce, and stringent legal and regulatory constraints make it harder still to obtain, which is a significant hurdle to training high-quality language models. Second, the generated output must remain interpretable and accurate, and privacy protection must not come at the cost of expressive capacity. Third, privacy protection matters more in medical text generation than in routine tasks such as chat and translation, because of data sensitivity, legal and ethical considerations, and the ease of filtering private information from medical text. This calls for stricter protection during both the training and inference stages to guard against malicious attackers.

To address these issues, we present a novel solution. We begin with a model pre-trained on extensive Chinese corpora and then fine-tune it on medical knowledge corpora to strengthen its expressive power in the medical domain. Our strategy also uses multi-party secure computation so that several participants can supply training data while preserving privacy, which significantly mitigates the scarcity of medical data.

To address privacy protection, we present a comprehensive analysis of the privacy attack model in medical text generation tasks and demonstrate the threat it poses to privacy and security. We also propose an advanced model inversion attack method for the inference stage. For the training stage, we construct and implement a multi-party secure computation protocol for the Transformer-based model to ensure training confidentiality, and we deploy Intel SGX to guarantee the integrity of the training process. For the inference stage, we propose a selective differential privacy optimizer and a selective differential privacy decoding algorithm for the Transformer-based model. These deter malicious attackers from accessing or inferring private training data while preserving the accuracy and interpretability of the generated output. Because private information is easy to filter from medical text, deploying selective differential privacy yields considerable benefits.

Furthermore, we introduce a new metric, the "medical text generation scientific index", to assess the scientific soundness and accuracy of the generated medical text. We validated this index through rigorous experiments, which substantiate the scientific robustness of our model. This thesis is a comprehensive exploration of privacy issues in Chinese medical text generation tasks and provides novel solutions to them. It extends the theoretical understanding of privacy protection in medical text generation and provides practical, effective strategies and tools for privacy protection in this domain.
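The abstract does not give the details of the selective differential privacy optimizer, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes that private tokens can be flagged by a simple filter, that only the gradients from flagged tokens are clipped and perturbed with Gaussian noise, and that the remaining gradients are aggregated as usual. The names `is_private`, `clip`, and `selective_dp_gradient`, the digit-run pattern, and all default parameters are hypothetical.

```python
import math
import random
import re

# Hypothetical policy: treat long digit runs (IDs, phone numbers) as private.
PRIVATE_PATTERN = re.compile(r"\d{6,}")


def is_private(token: str) -> bool:
    """Return True if the whole token matches the assumed privacy pattern."""
    return PRIVATE_PATTERN.fullmatch(token) is not None


def clip(grad, max_norm):
    """Scale grad so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def selective_dp_gradient(tokens, grads, max_norm=1.0,
                          noise_multiplier=1.0, rng=None):
    """Average per-token gradients, protecting only the private tokens.

    Assumes tokens and grads are non-empty and aligned one to one.
    Gradients of private tokens are clipped and their sum receives
    Gaussian noise; gradients of other tokens are summed unchanged.
    """
    rng = rng or random.Random()
    dim = len(grads[0])
    public_sum = [0.0] * dim
    private_sum = [0.0] * dim
    n_private = 0
    for token, grad in zip(tokens, grads):
        if is_private(token):
            clipped = clip(grad, max_norm)
            private_sum = [a + b for a, b in zip(private_sum, clipped)]
            n_private += 1
        else:
            public_sum = [a + b for a, b in zip(public_sum, grad)]
    if n_private > 0:
        # Noise standard deviation follows the usual DP-SGD convention.
        sigma = noise_multiplier * max_norm
        private_sum = [g + rng.gauss(0.0, sigma) for g in private_sum]
    return [(p + q) / len(tokens) for p, q in zip(public_sum, private_sum)]
```

Calling `selective_dp_gradient(["patient", "13800138000"], [[1.0, 0.0], [3.0, 4.0]])` would leave the first gradient untouched and clip and perturb only the second. A real system would need a far more careful privacy policy than one regular expression, plus privacy accounting across training steps, neither of which this sketch attempts.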

Keywords/Search Tags:

Natural Language Processing, Medical Text Generation, Differential Privacy, Multi-Party Secure Computation, Privacy-preserving Deep Learning

Related items

1	Research On Methods Of Privacy-Preserving For Medical Data Query Computation
2	Research On Privacy-preserving Computation Of Electronic Health Records
3	Researches On Privacy-preserving Combinatorial Optimization And Its Applications For Medical Data
4	Deep Learning With Differential Privacy And Applied To Medical Image Analysis
5	Research On The Privacy Protection Of Medical Information Based On Blockchain
6	Research On Privacy-preserving Machine Learning In Medical Environment
7	Research On Privacy-preserving Classification Methods For Medical Image
8	A Blockchain-based Scheme For Privacy-preserving And Secure Sharing Of Medical Data
9	Research On Application Of Secure-Aware And Privacy-Preserving Medical Data Sharing
10	A Study On Disease Prediction Model Based On Small Sample Medical Data And Its Privacy Preserving Technologies