Research On Anonymity Models And Algorithms For Microdata Privacy Preserving With Multi-Sensitive Attributes

Posted on: 2014-10-17, Degree: Master, Type: Thesis
Country: China, Candidate: F W Luo, Full Text: PDF
GTID: 2268330425452455, Subject: Computer software and theory
Abstract/Summary:
The Internet age has produced large volumes of data about individuals, known as microdata. Microdata are valuable for trend analysis, disease prediction, and operational decision-making, so organizations may release private data to support analysis and research. For example, a hospital may release patients' medical records to aid medical studies. However, publishing microdata can threaten individuals' privacy, and privacy-preserving data publishing has therefore become an active topic in data mining. Most existing results target microdata with a single sensitive attribute, yet much real-world microdata contains multiple sensitive attributes. Anonymity models and algorithms designed for a single sensitive attribute cannot be applied directly to such data, so research on anonymity models and algorithms for microdata with multiple sensitive attributes is of considerable significance.

Preserving privacy for microdata with multiple sensitive attributes requires suitable models that ensure safe data publishing, together with algorithms that realize those models. This thesis concentrates on anonymity models and algorithms for microdata with multiple sensitive attributes. The main contributions are as follows:

(1) An (l, m)-diversity model that resists associated attacks on multiple sensitive attributes is proposed. Existing anonymity models for publishing microdata do not capture the relationships among sensitive attributes and cannot resist attacks that exploit those relationships. To address this problem, the thesis proposes an (l, m)-diversity model. The model requires that, in each equivalence class, the diversity of each sensitive attribute is at least l and that, when one sensitive value is deleted, the remaining sensitive values still satisfy (l-1, m)-diversity. Two algorithms, BottomUp and TopDown, are proposed to implement the (l, m)-diversity model. Experimental results show that the proposed algorithms implement the (l, m)-diversity model and effectively preserve privacy when publishing microdata with multiple sensitive attributes.

(2) SLOMS, a privacy-preserving data publishing method for microdata with multiple sensitive attributes, is proposed. Multi-dimension bucketization is a typical method for anonymizing multiple sensitive attributes. However, it leads to low data utility as the number of sensitive attributes grows. In addition, it does not generalize quasi-identifiers, which leaves the anonymized data vulnerable to linking attacks. To overcome these drawbacks, the thesis proposes the SLOMS method. SLOMS vertically partitions the sensitive attributes into several tables and bucketizes each sensitive attribute table to achieve l-diversity. At the same time, it generalizes the quasi-identifiers to achieve k-anonymity. The thesis also proposes the MSB-KACA algorithm, which anonymizes microdata with multiple sensitive attributes using SLOMS. Experiments show that SLOMS generates anonymized tables with a lower suppression ratio and less distortion than generalization and MSB.

(3) Bucketization permutation, an anonymization method for microdata with multiple sensitive attributes, is proposed. Anatomy is an effective anonymization technique, but it does not process quasi-identifier attributes, which creates a security risk. To overcome this drawback, the thesis proposes the bucketization permutation method, which combines multi-dimension bucketization with permutation and can therefore anonymize both sensitive attributes and quasi-identifier attributes. Two algorithms, NMBPA and CDMBPA, are proposed to implement bucketization permutation. Experiments on real data show that the method produces high-quality data with low suppression ratios while offering strong privacy guarantees.
Keywords/Search Tags:multi-sensitive attributes, privacy preserving, (l,m)-diversity, SLOMS, bucketization permutation
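
The (l, m)-diversity condition in contribution (1) can be illustrated with a small check on a single equivalence class. The sketch below is one possible reading of the abstract, not the thesis's BottomUp or TopDown algorithm. It assumes that m is the number of sensitive attributes, that "diversity" means the count of distinct values, and that "deleting one sensitive value" removes every record carrying that value. The function name `is_lm_diverse` and the sample records are hypothetical.

```python
def is_lm_diverse(rows, sens_attrs, l):
    """Check an assumed reading of (l, m)-diversity for one equivalence class.

    rows       -- list of dicts, one per record in the class
    sens_attrs -- names of the m sensitive attributes
    l          -- required diversity level
    """
    # Base case (assumption): (1, m)-diversity only needs a non-empty class.
    if l <= 1:
        return len(rows) > 0
    # Every sensitive attribute needs at least l distinct values.
    for attr in sens_attrs:
        if len({r[attr] for r in rows}) < l:
            return False
    # Deleting all records that carry any one sensitive value must
    # leave a class that is still (l-1, m)-diverse.
    for attr in sens_attrs:
        for value in {r[attr] for r in rows}:
            rest = [r for r in rows if r[attr] != value]
            if not is_lm_diverse(rest, sens_attrs, l - 1):
                return False
    return True


SENS = ["disease", "salary"]  # hypothetical sensitive attributes (m = 2)

# Values of the two attributes are not correlated in this class.
group_a = [
    {"disease": "flu", "salary": "low"},
    {"disease": "cancer", "salary": "mid"},
    {"disease": "hiv", "salary": "high"},
]

# Each attribute has 3 distinct values, but every non-flu record has a
# high salary, so ruling out "flu" reveals the salary.
group_b = [
    {"disease": "flu", "salary": "low"},
    {"disease": "flu", "salary": "mid"},
    {"disease": "cancer", "salary": "high"},
    {"disease": "hiv", "salary": "high"},
]

print(is_lm_diverse(group_a, SENS, 3))
print(is_lm_diverse(group_b, SENS, 3))
```

Under these assumptions, `group_b` passes a per-attribute distinct-value check with l = 3 but fails the recursive condition. This is the kind of attack through related sensitive attributes that the abstract says single-attribute models cannot resist.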