PRIVACY-PRESERVING SELECTIVE AGGREGATION OF ONLINE USER BEHAVIOR DATA

ABSTRACT
Vast amounts of online user behavior data are generated every day on the booming and ubiquitous Internet. Growing efforts have been devoted to mining this abundant behavior data to extract valuable information for research purposes or business interests. However, this puts online users’ privacy at risk of being exposed to third parties. The last decade has witnessed a body of research on performing data aggregation in a privacy-preserving way. Most existing methods guarantee strong privacy protection, yet at the cost of very limited aggregation operations, such as allowing only summation, which hardly satisfies the needs of behavior analysis. In this paper, we propose a scheme, PPSA, which encrypts users’ sensitive data to prevent privacy disclosure to both outside analysts and the aggregation service provider, and fully supports selective aggregate functions for online user behavior analysis while guaranteeing differential privacy. We have implemented our method and evaluated its performance using a trace-driven evaluation based on a real online behavior dataset. Experimental results show that our scheme effectively supports both overall aggregate queries and various selective aggregate queries with acceptable computation and communication overheads.
EXISTING SYSTEM:
Privacy-preserving aggregation of sensitive user data has attracted much attention recently, covering health care data, time-series data, wireless sensor network data, and online behavior data for analysis and advertising. In general, there are two types of systems in previous work. In a centralized system, all the user data are stored on the server. It is important that users encrypt or encode their data before sending them to the server. The server holds the encrypted data, but it can only compute answers to queries obliviously. However, these proposals have different goals from our system and do not support selective aggregation. Moreover, they do not guarantee differential privacy. Homomorphic encryption is a common method to achieve aggregation of encrypted data without decryption. Chen et al. instead used an order-preserving hash-based function to encode both data and queries. However, their goal differs from ours, and their approach cannot evaluate selective aggregation. Li et al. [43] proposed a system that processes range queries, but it does not compute aggregates and assumes that analysts are trusted. In contrast, PPSA combines differential privacy and homomorphic encryption, and is able to selectively aggregate encrypted user data.
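To make "aggregation of encrypted data without decryption" concrete, the following is a minimal sketch using a toy Paillier cryptosystem. Paillier is a stand-in chosen for brevity, not the cryptosystem PPSA is built on, and the primes, function names, and sample values are illustrative assumptions. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy key generation; these primes are far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # inverse of lam mod n; valid for generator g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    """Encrypt an integer m with 0 <= m < n under the public modulus n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)

def decrypt(n, sk, c):
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n

# The server combines ciphertexts without seeing any individual value.
n, sk = keygen()
readings = [3, 10, 25, 7]  # hypothetical per-user values
ciphertexts = [encrypt(n, v) for v in readings]
total_ct = ciphertexts[0]
for ct in ciphertexts[1:]:
    total_ct = add(n, total_ct, ct)
print(decrypt(n, sk, total_ct))  # prints 45
```

An additively homomorphic scheme like this one supports only summation, which is the limitation of earlier methods that the abstract points out.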
PROPOSED SYSTEM:
To address these challenges, we design a scheme, PPSA (Privacy-Preserving Selective Aggregation). In general, our contributions can be summarized as follows:
- We present the first scheme, PPSA, that allows privacy-preserving selective aggregation on user data, which plays a critical role in online user behavior analysis.
- We combine homomorphic encryption and a differential privacy mechanism to protect users’ sensitive information from both analysts and aggregation service providers, and to protect individuals’ privacy from being inferred. We prove that differential privacy can be achieved by adding two Geometric variables, which are computed via homomorphic encryption. Furthermore, we present a privacy analysis of PPSA.
- We extend PPSA to two more scenarios to fully support more complex selective aggregation of user data. We use a calculation to evaluate aggregation selected by multiple boolean attributes. We design a method of oblivious comparison between two integers, and use it to evaluate aggregation selected by a numeric attribute.
- We implement PPSA and perform a trace-driven evaluation based on an online behavior dataset. Evaluation results show that our scheme effectively supports various selective aggregate queries with high accuracy and acceptable computation and communication overheads.
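The role of the two Geometric variables can be illustrated with a standard construction: the difference of two independent geometric variables with parameter alpha = exp(-epsilon) follows a two-sided geometric (discrete Laplace) distribution, which gives epsilon-differential privacy for a counting query of sensitivity 1. The sketch below works on plaintext and assumes this standard construction. The exact way PPSA generates and combines the two variables under encryption may differ, and all function names are illustrative.

```python
import math
import random

def sample_geometric(alpha, rng):
    """Sample G in {0, 1, 2, ...} with P(G = k) = (1 - alpha) * alpha**k."""
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return int(math.floor(math.log(u) / math.log(alpha)))

def two_sided_geometric(epsilon, rng, sensitivity=1):
    """Difference of two i.i.d. geometric variables (discrete Laplace noise)."""
    alpha = math.exp(-epsilon / sensitivity)
    return sample_geometric(alpha, rng) - sample_geometric(alpha, rng)

def noisy_count(true_count, epsilon, rng):
    """Release an integer count perturbed with two-sided geometric noise."""
    return true_count + two_sided_geometric(epsilon, rng)

rng = random.Random(2024)  # fixed seed only to make the demo repeatable
print([noisy_count(1000, epsilon=0.5, rng=rng) for _ in range(5)])
```

Because the noise is integer-valued, it can be added to an integer aggregate directly, which fits cryptosystems that operate on integer plaintexts.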
CONCLUSION
In this paper, we have described the challenges of aggregating online user data while preserving users’ privacy. Based on the BGN homomorphic cryptosystem, we have designed the first system that is able to securely and selectively aggregate user data, making it practical for realistic data analytics. It guarantees strong privacy preservation by using a differential privacy mechanism to protect individuals’ privacy. We have presented PPSA to evaluate aggregation selected by one boolean attribute, and extended it to aggregation selected by multiple boolean attributes and by one numeric attribute. Extensive experiments have shown that PPSA supports various selective aggregate queries with acceptable overhead and high accuracy.
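As a plaintext reference for the three query types named above, the sketch below computes each selective aggregate as a sum of value times indicator bit. This only shows what the queries return. In PPSA the values and selection conditions are processed under encryption, and the record fields and data here are hypothetical. The sum-of-products form is worth noting because BGN supports arbitrary additions and one multiplication on ciphertexts.

```python
from dataclasses import dataclass

@dataclass
class Record:
    value: int         # quantity to aggregate (hypothetical, e.g. page views)
    is_mobile: bool    # hypothetical boolean attribute
    is_new_user: bool  # hypothetical boolean attribute
    age: int           # hypothetical numeric attribute

def select_sum(records, indicator):
    """Sum record values, each weighted by a 0/1 selection bit."""
    return sum(r.value * int(indicator(r)) for r in records)

records = [
    Record(5, True, False, 23),
    Record(2, False, True, 41),
    Record(7, True, True, 35),
    Record(1, True, True, 19),
]

# Selected by one boolean attribute.
by_one = select_sum(records, lambda r: r.is_mobile)
# Selected by multiple boolean attributes (logical AND of the bits).
by_many = select_sum(records, lambda r: r.is_mobile and r.is_new_user)
# Selected by a numeric attribute (the comparison yields the selection bit).
by_numeric = select_sum(records, lambda r: r.age >= 30)

print(by_one, by_many, by_numeric)  # prints 13 8 9
```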
REFERENCES
[1] R. E. Bucklin and C. Sismeiro, “Click here for internet insight: Advances in clickstream data analysis in marketing,” Journal of Interactive Marketing, vol. 23, no. 1, pp. 35–48, 2009.
[2] R. Bose, “Advanced analytics: opportunities and challenges,” Industrial Management & Data Systems, vol. 109, no. 2, pp. 155–172, 2009.
[3] H. Chen, R. H. Chiang, and V. C. Storey, “Business intelligence and analytics: From big data to big impact,” MIS Quarterly, vol. 36, no. 4, pp. 1165–1188, 2012.
[4] I. E. Akkus, R. Chen, M. Hardt, P. Francis, and J. Gehrke, “Non-tracking web analytics,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2012, pp. 687–698.
[5] F. Roesner, T. Kohno, and D. Wetherall, “Detecting and defending against third-party tracking on the web,” in Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, 2012.
[6] Directive 2009/136/EC of the European Parliament and of the Council. [Online]. Available: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2009:337:0011:0036:en:PDF
[7] Web tracking protection. [Online]. Available: http://www.w3.org/Submission/web-tracking-protection/
[8] V. Rastogi and S. Nath, “Differentially private aggregation of distributed time-series with transformation and encryption,” in Proceedings of the ACM International Conference on Management of Data (SIGMOD), 2010, pp. 735–746.
[9] B. Applebaum, H. Ringberg, M. J. Freedman, M. Caesar, and J. Rexford, “Collaborative, privacy-preserving data aggregation at scale,” in Proceedings of the 10th Privacy Enhancing Technologies Symposium (PETS), 2010, pp. 56–74.