Journal of Xidian University ›› 2024, Vol. 51 ›› Issue (4): 170-179. doi: 10.19665/j.issn1001-2400.20240201

• Computer Science and Technology & Cyberspace Security •

K-anonymity privacy-preserving data sharing for a dynamic game scheme

CAO Laicheng, HOU Yangning, FENG Tao, GUO Xian

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, Gansu, China
  • Received: 2023-11-02 Online: 2024-08-20 Published: 2024-03-08
  • About the authors: CAO Laicheng (b. 1965), male, professor, E-mail: caolch@lut.edu.cn
    HOU Yangning (b. 1997), female, master's student at Lanzhou University of Technology, E-mail: 1467957375@qq.com
    FENG Tao (b. 1970), male, research fellow, E-mail: fengt@lut.edu.cn
    GUO Xian (b. 1971), male, professor, E-mail: iamxg@163.com
  • Supported by:
    National Natural Science Foundation of China (61562059); National Natural Science Foundation of China (62162039); Natural Science Foundation of Gansu Province (20JR5RA467)

Abstract:

To address the shortage of large amounts of labeled training data and the risk of data privacy leakage when training deep learning models, a k-anonymity privacy-preserving data sharing scheme for dynamic games (KPDSDG) is proposed. First, a dynamic game strategy is introduced to design an optimal data k-anonymization scheme, which achieves secure data sharing while protecting data privacy. Second, a data anonymization evaluation framework is proposed that evaluates anonymization schemes by the availability, privacy, and information loss of the anonymized data, which can further improve the privacy and availability of the data and reduce the risk of re-identification. Finally, a conditional generative adversarial network is used to generate data, which solves the problem that model training lacks a large number of labeled samples. The security analysis shows that the entire sharing process ensures that the private information of data owners is not leaked. Experiments show that a model trained on the data generated after anonymization by this scheme is more accurate than models trained under other schemes, by 8.83% in the best case, and that its accuracy is essentially the same as that of a model trained on the raw data, differing by only 0.34% in the best case. The scheme also has a lower computational overhead. The scheme therefore satisfies data anonymity, data augmentation, and secure data sharing simultaneously.

Key words: conditional generative adversarial network, data anonymity, privacy evaluation, privacy-preserving, data sharing

CLC Number:

  • TP309.2
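
The k-anonymity property that the abstract builds on can be illustrated with a minimal sketch. This is not the KPDSDG scheme itself: it contains no dynamic game, no evaluation framework, and no generative model. It only shows what "k-anonymous" means for a table. The records, the field names (`age`, `zip`, `label`), and the helper names `generalize_age` and `is_k_anonymous` are all hypothetical and chosen for illustration.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Coarsen an exact age into a bucket such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical toy records; "label" stands in for a training label.
raw = [
    {"age": 31, "zip": "73005", "label": "A"},
    {"age": 34, "zip": "73005", "label": "B"},
    {"age": 38, "zip": "73005", "label": "A"},
    {"age": 42, "zip": "73005", "label": "B"},
    {"age": 45, "zip": "73005", "label": "A"},
    {"age": 47, "zip": "73005", "label": "A"},
]

# Generalizing the age column merges records into larger groups.
anonymized = [{**r, "age": generalize_age(r["age"])} for r in raw]

raw_ok = is_k_anonymous(raw, ["age", "zip"], k=3)
anonymized_ok = is_k_anonymous(anonymized, ["age", "zip"], k=3)
```

In this toy table every raw record has a unique age, so the raw data fails the check for k = 3, while the generalized table forms two groups of three records each and passes. Coarser buckets raise k at the cost of more information loss, which is the kind of trade-off the abstract's evaluation framework is described as measuring.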