Journal of Xidian University ›› 2023, Vol. 50 ›› Issue (4): 111-120. doi: 10.19665/j.issn1001-2400.2023.04.011

• Cyberspace Security Column •

Differentially private federated learning framework with adaptive clipping

WANG Fangwei, XIE Meiyun, LI Qingru, WANG Changguang

  1. Hebei Province Key Laboratory of Network and Information Security, College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024, China
  • Received: 2023-01-08 Online: 2023-08-20 Published: 2023-10-17
  • About the authors: WANG Fangwei (1976-), male, professor, E-mail: fw_wang@hebtu.edu.cn; XIE Meiyun (1998-), female, M.S. candidate at Hebei Normal University, E-mail: xmy-123@stu.hebtu.edu.cn; LI Qingru (1971-), female, associate professor, E-mail: qingruli@hebtu.edu.cn
  • Supported by:
    the National Natural Science Foundation of China (61572170); the Natural Science Foundation of Hebei Province (F2021205004); the Key Foundation of the Education Department of Hebei Province (ZD2021062)



Abstract:

Federated learning allows the parties involved in training to achieve collaborative modeling without sharing their own data. Its data isolation strategy safeguards the privacy and security of user data to a certain extent and effectively alleviates the problem of data silos. However, the training process of federated learning involves a large number of parameter interactions between the participants and the server, so there is still a risk of privacy disclosure. To address the privacy protection problem during data transmission, a differentially private federated learning framework with adaptive clipping, ADP_FL, is proposed. In this framework, each participant uses its own data to train the model by performing multiple iterations locally. In each iteration, a clipping threshold is selected adaptively and the gradients are clipped with it, limiting them to a reasonable range. Dynamic Gaussian noise is added only to the uploaded model parameters to mask the contribution of each participant, and the server aggregates the received noisy parameters to update the global model. The adaptive gradient clipping strategy not only calibrates the gradients reasonably; since the clipping threshold is a parameter of the sensitivity, it also controls the scale of the added noise by dynamically changing the sensitivity. Theoretical analysis and experiments show that the proposed framework can still achieve good model accuracy under strong privacy constraints.
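The local step described in the abstract — adaptively selecting a clipping threshold, clipping the gradients, and adding Gaussian noise whose scale tracks the dynamic sensitivity — can be sketched as follows. This is a minimal illustration only: the quantile-based threshold rule, the function name, and all parameters are assumptions, since the abstract does not give the framework's exact adaptive rule or noise calibration.

```python
import numpy as np

def adaptive_clip_and_noise(per_example_grads, noise_multiplier, quantile=0.5, rng=None):
    """One illustrative local DP step: pick a clipping threshold adaptively
    from the current batch, clip each per-example gradient to it, average,
    and add Gaussian noise scaled to the (now dynamic) sensitivity."""
    rng = np.random.default_rng() if rng is None else rng
    norms = np.linalg.norm(per_example_grads, axis=1)
    # Adaptive threshold: a quantile of the current gradient norms
    # (an illustrative choice, not necessarily the paper's rule).
    C = np.quantile(norms, quantile)
    # Scale each gradient so its norm is at most C.
    factors = np.minimum(1.0, C / np.maximum(norms, 1e-12))
    clipped = per_example_grads * factors[:, None]
    avg = clipped.mean(axis=0)
    # The sensitivity of the clipped average over n examples is C / n,
    # so the Gaussian noise scale follows the adaptive threshold.
    n = per_example_grads.shape[0]
    noisy = avg + rng.normal(0.0, noise_multiplier * C / n, size=avg.shape)
    return noisy, C
```

Because the threshold C enters the noise standard deviation directly, lowering it in iterations with small gradients reduces both the clipping bias and the injected noise, which is the intuition behind the dynamic sensitivity described above.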

Key words: federated learning, differential privacy, privacy disclosure, adaptive clipping

CLC number: 

  • TP391