
Research On Trajectory Sequential Data Publishing For Differential Privacy Protection

Posted on: 2022-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: S P Zhao
Full Text: PDF
GTID: 2518306539463004
Subject: Software engineering
Abstract/Summary:
With the widespread availability of location-based devices, data administrators such as location-based service providers have collected large amounts of user location data and movement trajectory data. These data may contain users' private information, so publishing them directly or sharing them with third parties may lead to the disclosure of personal privacy. Differential privacy can resist background-knowledge attacks and protect data privacy effectively. This paper applies differential privacy techniques to trajectory sequential data in order to generate and publish a trajectory data set with high data utility. Focusing on trajectory sequential data, it studies differentially private data publishing, defines the problem, and proposes solutions based on probability distributions and on clustering. The specific research contents are as follows:

(1) The differentially private trajectory data publishing problem is defined. Trajectory sequential data are introduced; such sequence data are high-dimensional and sparse. This paper summarizes the current mainstream attack models, analyzes the shortcomings of current privacy protection methods, and applies the non-interactive differential privacy framework to design a data protection model that improves both the protection level and data utility. In addition, six evaluation methods are adopted to evaluate the data sets published by the model in terms of statistics, frequent patterns, and Euclidean geometry.

(2) A trajectory sequential data publishing model based on probability distributions for differential privacy protection is proposed. First, based on the minimum description length, the model removes redundant trajectory points, which simplifies each trajectory and extracts its representative points. Then, according to the proportion of trajectory points in each cell, a multi-level grid structure that satisfies differential privacy is generated adaptively and iteratively, and the trajectory points are mapped into the grid. Next, to preserve the important information of the input trajectories, three models are extracted and established, all of which satisfy differential privacy: the probability distribution of trajectory start and end point pairs, the transition probability distribution, and the trajectory length model. A new trajectory data set is then generated. Finally, the integrated trajectory sequential data set is post-processed using a method based on direction and trajectory point density. On both real and simulated data sets, five evaluation methods are used to verify that the data set published by the proposed model has higher data utility in terms of statistics and frequent patterns.

(3) A trajectory sequential data publishing model based on clustering for differential privacy protection is proposed. First, the trajectory points at each timestamp are aggregated by AP clustering. Taking the aggregation results as the standard and combining them with the Hausdorff distance, the candidate partitions are calculated and filtered. Then, the exponential mechanism is used to select the final aggregation partition under differential privacy, and Laplace noise is added to the aggregated trajectory data set. Finally, the integrated trajectory sequence data set is published after consistency post-processing. Algorithm analysis and experimental comparison verify that the trajectory sequence data set published by this model has higher data utility.

The proposed models are evaluated from multiple perspectives on multiple real and simulated data sets. Comparison with up-to-date models verifies that the models proposed in this paper are more feasible and are worth further research.
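
The abstract names three standard building blocks without giving formulas: Laplace noise, the exponential mechanism, and the Hausdorff distance. The sketch below shows the textbook form of each in Python. It is not the thesis's implementation: the function names, the sensitivity of 1, the per-cell count dictionary, and the idea of scoring candidate partitions with a single utility number are all assumptions made for illustration.

```python
import math
import random
from typing import Dict, Hashable, Optional, Sequence, Tuple

Point = Tuple[float, float]


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(
    counts: Dict[Hashable, float],
    epsilon: float,
    sensitivity: float = 1.0,  # assumed: one record changes a count by at most 1
    rng: Optional[random.Random] = None,
) -> Dict[Hashable, float]:
    """Laplace mechanism: add noise with scale sensitivity / epsilon to each count."""
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return {cell: value + laplace_noise(scale, rng) for cell, value in counts.items()}


def exponential_mechanism(
    scores: Sequence[float],
    epsilon: float,
    sensitivity: float = 1.0,  # assumed sensitivity of the utility score
    rng: Optional[random.Random] = None,
) -> int:
    """Return index i with probability proportional to exp(eps * score_i / (2 * sens))."""
    rng = rng or random.Random()
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


def hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance between two non-empty point sets."""

    def directed(p: Sequence[Point], q: Sequence[Point]) -> float:
        # Largest distance from a point in p to its nearest neighbour in q.
        return max(min(math.dist(x, y) for y in q) for x in p)

    return max(directed(a, b), directed(b, a))
```

In a pipeline like the one described in (3), one plausible use (again an assumption) would be to score each candidate partition by its negated Hausdorff distance to the clustering result, pass those scores to `exponential_mechanism`, and then apply `noisy_counts` to the aggregated data. A smaller `epsilon` gives stronger privacy but noisier counts and a more uniform selection.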
Keywords/Search Tags: Trajectory sequential data, Data publishing, Differential privacy, Probability distribution, Clustering