J4 ›› 2014, Vol. 41 ›› Issue (5): 148-154+160. doi: 10.3969/j.issn.1001-2400.2014.05.025

• Original Articles •

Enhancing user privacy for personalized web search in big data

KANG Haiyan1; XIONG Li2

  (1. School of Information Management, Beijing Information Science and Technology University, Beijing 100192, China;
    2. Department of Mathcs, Emory University, Atlanta 30322, USA)
  • Received: 2013-05-08 Online: 2014-10-20 Published: 2014-11-27
  • Contact: KANG Haiyan E-mail:kanghaiyan@126.com

Abstract:

To resolve the conflict between the risk of leaking user privacy in big data and the need to improve the performance of personalized information retrieval, an anonymization method based on differential privacy combined with p-link technology is proposed. First, quasi-identifiers are generalized and noise is added to meet the differential privacy requirements; this maximizes the query accuracy of the statistical database while minimizing the probability of identifying individual records. Second, user profiles are clustered by similarity into equivalence groups that satisfy p-link, and the weights and centroid of each equivalence group are calculated. Finally, the anonymized data are released. Experimental results demonstrate that the method, which integrates the characteristics of differential privacy and p-link, does not change users' interests and protects users' privacy while maintaining personalized retrieval performance.

Key words: user profile, anonymization, privacy protection, information security, differential privacy
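
The steps named in the abstract (generalizing quasi-identifiers, adding noise for differential privacy, and computing an equivalence-group centroid) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define p-link or the clustering procedure, so neither is shown here. The function names, the age attribute, the bin width, the use of the Laplace mechanism with sensitivity 1, and the term-weight representation of a user profile are all assumptions made for illustration.

```python
import random
from collections import Counter


def generalize_age(age, width=10):
    """Generalize an exact age (a hypothetical quasi-identifier) to an interval label."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(values, bins, epsilon, rng):
    """Release a count per bin with Laplace noise.

    Assumes adding or removing one record changes one count by 1
    (sensitivity 1), so the noise scale is 1 / epsilon. Every bin in the
    fixed domain is released, including empty ones.
    """
    counts = Counter(values)
    scale = 1.0 / epsilon
    return {b: counts.get(b, 0) + laplace_noise(scale, rng) for b in bins}


def group_centroid(profiles):
    """Average the term weights of the user profiles in one group.

    Each profile is assumed to be a dict mapping an interest term to a weight;
    a term missing from a profile counts as weight 0.
    """
    terms = set().union(*profiles)
    n = len(profiles)
    return {t: sum(p.get(t, 0.0) for p in profiles) / n for t in sorted(terms)}


if __name__ == "__main__":
    rng = random.Random(0)
    ages = [23, 27, 31, 38, 39, 45]
    bins = [f"{low}-{low + 9}" for low in range(20, 50, 10)]
    labels = [generalize_age(a) for a in ages]
    print(noisy_histogram(labels, bins, epsilon=1.0, rng=rng))
    print(group_centroid([{"music": 1.0, "sports": 0.5}, {"music": 0.0}]))
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of less accurate counts, which is the trade-off the abstract refers to.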