西安电子科技大学学报 ›› 2023, Vol. 50 ›› Issue (6): 219-236.doi: 10.19665/j.issn1001-2400.20230207

• 网络空间安全 • 上一篇    下一篇

混洗差分隐私保护的度分布直方图发布算法

丁红发1,2,3(),傅培旺1(),彭长根2(),龙士工2(),吴宁博1()   

  1. 1.贵州财经大学 信息学院 贵州省大数据统计分析重点实验室,贵州 贵阳 550025
    2.贵州大学 公共大数据国家重点实验室,贵州 贵阳 550025
    3.贵安新区科创产业发展有限公司,贵州 贵阳 550025
  • 收稿日期:2022-11-01 出版日期:2023-12-20 发布日期:2024-01-22
  • 作者简介:丁红发(1988—),男,副教授,E-mail:hongfa.ding@foxmail.com;|傅培旺(2000—),男,贵州财经大学硕士研究生,E-mail:1145646736@qq.com;|彭长根(1963—),男,教授,E-mail:peng_stud@163.com;|龙士工(1965—),男,教授,E-mail:526796467@qq.com;|吴宁博(1989—),男,副教授,E-mail:hn_dragon@163.com
  • 基金资助:
    国家自然科学基金(62002080);贵州财经大学校级项目(2021KYYB14)

Histogram publishing algorithm for degree distribution via shuffled differential privacy

DING Hongfa1,2,3(),FU Peiwang1(),PENG Changgen2(),LONG Shigong2(),WU Ningbo1()   

  1. 1. College of Information,Guizhou Key Laboratory of Big Data Statistical Analysis,Guizhou University of Finance and Economics,Guiyang 550025,China
    2. State Key Laboratory of Pubic Big Data,Guizhou University,Guiyang 550025,China
    3. Guian Science and Technology Industry Development Co.,Ltd.,Guiyang 550025,China
  • Received:2022-11-01 Online:2023-12-20 Published:2024-01-22

摘要:

当前,基于中心化或本地差分隐私的图数据度分布直方图发布算法无法有效平衡发布数据的隐私保护程度及其可用性,且不能有效保护用户的身份隐私。针对该问题,在编码-混洗-分析框架下提出一种混洗差分隐私保护的度分布直方图发布算法。首先,设计混洗差分隐私图数据度分布直方图隐私保护框架,采取交互式用户分组、混洗器及方波本地加噪扰动机制降低编码器对分布式用户本地差分隐私加噪的噪声影响,并利用极大似然估计在分析器端对加噪后的度分布直方图进行数据矫正,从而提高数据效用;其次,提出具体的分布式用户分组、混洗差分隐私加噪和数据矫正算法,并证明其满足(ε,σ)-混洗差分隐私。实验和对比结果表明,所提算法能保护分布式用户隐私,在L1距离、H距离和MSE多个指标度量下的数据效用比已有算法提升了26%以上,且具有较低的时间开销和稳定的数据效用表现,适用不同规模的图数据度分布直方图发布共享应用。

关键词: 隐私保护技术, 图结构, 混洗差分隐私, 度分布直方图发布, 数据效用

Abstract:

At present,the existing histogram publishing algorithms based on centralized or local differential privacy for graph data degree distribution can neither balance the privacy and utility of published data,nor preserve the identity privacy of end users.To solve this problem,a histogram publishing algorithm for degree distribution via shuffled differential privacy(SDP) is proposed under the framework of Encode-Shuffle-Analyze.First,a privacy preserving framework for histogram publishing of degree distribution is designed based on shuffled differential privacy.In this framework,the noisy impact that the encoder brings to distributed users is reduced by employing interactive user grouping,the shuffler and the square wave noise mechanism,while adding noise via local differential privacy.The noisy histogram of degree distribution is reconciled via the maximum likelihood estimation at the analyzer end,thus improving the utility of published data.Second,specific algorithms are proposed for concreting distributed user grouping,adding shuffled differential privacy noise and reconciling the noisy data,respectively.Furthermore,it is proved that these algorithms meet the requirement of(ε,σ)-SDP.Experiments and comparisons illustrate that the proposed algorithms can preserve the privacy of distributed users,and that the data utility is improved more than 26% with metrics in terms of L1 distance,H distance and MSE in comparison with the existing related algorithms.The proposed algorithms also perform with a low overhead and stable data utility,and are suitable for publishing and sharing the histogram of degree distribution for different scales of graph data.

Key words: privacy-preserving techniques, graph structures, shuffled differential privacy, degree distribution histogram publishing, data utility

中图分类号: 

  • TN918