| The world is currently in the midst of a new round of scientific and technological revolution and industrial transformation. Information technology, represented by the Internet, is deeply integrated with human production and daily life. As Internet technology has developed rapidly worldwide, human society has entered the era of "big data", and the leakage of personal privacy caused by various data releases increasingly plagues people's lives. Which privacy protection mechanism can publish data effectively has therefore become a research hotspot. This thesis uses the differential privacy protection mechanism to publish real-time data streams. Differential privacy is one of the effective privacy protection mechanisms currently available and has been widely used. A variety of algorithms have been proposed to generate static histograms that satisfy differential privacy, but few histogram publishing methods target real-time data stream environments, and those that exist do not strike a good balance between noise error and data availability. The reasons are that (1) a data stream is highly real-time: data arrive in real time, so privacy protection for data release must also be applied in real time; (2) a data stream is highly continuous: data arrive continuously, with unknown arrival times and rates, so a continuous processing model is required to publish the data; and (3) a data stream is very large, which complicates privacy budget allocation and histogram counting under privacy protection. Based on this, this thesis first proposes a differential privacy histogram publishing method for data streams (DDHP), which uses the sliding window model to process newly arriving data in real time and uses a distance measure to quantify the similarity of the data at two adjacent timestamps in order to allocate the privacy budget dynamically. The optimal measure is selected by comparing the effectiveness of the application
of the L1, cosine, and Mahalanobis distances on a real data set. DDHP adopts a series-based privacy budget allocation strategy to allocate the privacy budget reasonably. The series method is in effect an improvement on the dichotomy method. This strategy exploits the sequential composition property of differential privacy and uses the concept of a series to fix the total differential privacy budget ε, which is then divided non-uniformly into arbitrarily small portions so that the growth of the added noise slows down and the availability of the published data improves. The experimental results show that the DDHP algorithm is effective and feasible. However, when the window size must be increased to process large amounts of data, the error of DDHP also grows rapidly, because the series-based privacy budget allocation mechanism is not suited to large windows. Therefore, this thesis proposes a new privacy budget allocation strategy, the BA (Budget Absorption) mechanism, which allocates the privacy budget dynamically, avoids premature exhaustion or surplus of the privacy budget, and reduces the algorithm's error while improving data availability. Experimental verification shows that the scheme is feasible. |
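
The series-based allocation described above can be illustrated with a minimal sketch. The abstract does not specify the exact series or how the distance measure triggers a release, so everything here is an assumption made for illustration: the budget is split by plain halving (the dichotomy that the series method is said to improve on), a hypothetical `threshold` on the L1 distance decides whether a timestamp reuses the previous release, and the function names are invented. The sketch only shows how sequential composition keeps the total spend within ε for one window.

```python
import random


def geometric_budgets(total_epsilon, n, ratio=0.5):
    """Split total_epsilon into n geometrically shrinking shares.

    Assumption: a plain geometric series (halving when ratio=0.5).
    The shares sum to total_epsilon * (1 - ratio**n), so they never
    exceed total_epsilon under sequential composition.
    """
    return [total_epsilon * (1 - ratio) * ratio ** i for i in range(n)]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def l1_distance(a, b):
    """L1 distance between two histograms with the same bins."""
    return sum(abs(x - y) for x, y in zip(a, b))


def publish_window(histograms, total_epsilon, threshold, seed=0):
    """Release one sliding window of histograms (illustrative only).

    Assumption: a timestamp whose histogram is within `threshold` (L1)
    of the last noisily released one reuses that release and spends no
    budget. A real mechanism would also have to privatize this
    comparison; it is done on raw counts here for brevity.
    """
    rng = random.Random(seed)
    budgets = iter(geometric_budgets(total_epsilon, len(histograms)))
    released, last_raw, spent = [], None, 0.0
    for counts in histograms:
        if last_raw is not None and l1_distance(counts, last_raw) <= threshold:
            released.append(released[-1])
            continue
        eps = next(budgets)
        spent += eps
        # Histogram counts have sensitivity 1, so the noise scale is 1 / eps.
        released.append([c + laplace_noise(1 / eps, rng) for c in counts])
        last_raw = counts
    return released, spent


window = [[10, 5, 3], [10, 5, 4], [2, 9, 9]]
released, spent = publish_window(window, total_epsilon=1.0, threshold=2)
```

Because each successive share is smaller, the noise scale 1/ε grows with every new release inside the window, which is consistent with the abstract's remark that the series-based strategy does not suit large windows.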