Journal of Xidian University ›› 2021, Vol. 48 ›› Issue (2): 205-212. doi: 10.19665/j.issn1001-2400.2021.02.026

• Computer Science and Technology •

Reciprocal bi-directional generative adversarial network for cross-modal pedestrian re-identification

WEI Ziyu1, YANG Xi1, WANG Nannan1, YANG Dong2, GAO Xinbo3

  1. School of Telecommunications Engineering, Xidian University, Xi'an 710071, China
    2. Xi'an Institute of Space Radio Technology, Xi'an 710100, China
    3. Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2020-06-12 Revised: 2020-01-06 Online: 2021-04-20 Published: 2021-04-28
  • Contact: YANG Xi
  • About the authors: WEI Ziyu (1996—), female, Ph.D. candidate at Xidian University, E-mail: zywei_xd@stu.xidian.edu.cn | WANG Nannan (1986—), male, professor, E-mail: nnwang@xidian.edu.cn | YANG Dong (1988—), male, senior engineer, E-mail: yangd504@126.com | GAO Xinbo (1972—), male, professor, E-mail: gaoxb@cqupt.edu.cn, xbgao@mail.xidian.edu.cn
  • Supported by:
    the National Natural Science Foundation of China (61976166, 61772402, 61671339, U1605252, 61772387, 61922066, 61876142); the National Key Research and Development Program of China (2016QY01W0200, 2018AAA0103202); the National Program for Special Support of High-Level Talents (CS31117200001); the Innovation Capability Support Program of Shaanxi (2020KJXX-027); the Fundamental Research Funds for the Central Universities (JB210115); the Young Talent Support Program of the Shaanxi Association for Science and Technology (20180104); the Key Research and Development Program of Shaanxi (2021GY-030); the Graduate Innovation Fund of Xidian University (5001-20109215456)



Abstract:

To improve the accuracy of cross-modal pedestrian re-identification, a method based on a reciprocal bi-directional generative adversarial network is proposed. First, two generative adversarial networks are built to generate cross-modal heterogeneous images. Second, an associated loss is designed to pull the feature distributions in the latent space close together during the translation between visible and infrared images, helping the networks generate fake heterogeneous images that closely resemble real ones. Finally, by concatenating the original and generated heterogeneous pedestrian images and feeding them into a discriminative feature extraction network, images from different modalities are unified into a common modality, thus suppressing the cross-modal gap. Representation learning and metric learning are employed to extract more discriminative pedestrian features. Comparative experiments on the SYSU-MM01 and RegDB datasets analyze the recognition accuracy under different loss functions. Compared with other state-of-the-art cross-modal pedestrian re-identification methods, the proposed method achieves higher accuracy and stronger robustness.
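The two optimization ideas in the abstract, pulling the two generators' latent features together during translation, and metric learning on the extracted features, can be sketched as follows. This is a minimal illustrative sketch in plain Python: the function names, the use of mean squared distance for the latent-alignment (associated) loss, and the triplet margin loss as the metric-learning objective are assumptions for illustration, not the paper's exact formulation.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two feature vectors (lists of floats)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def associated_loss(latent_v2i, latent_i2v):
    """Illustrative associated loss: pull the latent codes produced by the
    visible-to-infrared generator (latent_v2i) and the infrared-to-visible
    generator (latent_i2v) toward a common distribution by penalizing the
    mean squared distance between paired latent codes."""
    assert len(latent_v2i) == len(latent_i2v)
    total = 0.0
    for zv, zi in zip(latent_v2i, latent_i2v):
        total += sum((x - y) ** 2 for x, y in zip(zv, zi))
    return total / len(latent_v2i)


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet margin loss for metric learning: push the anchor
    closer to a same-identity (positive) feature than to a different-identity
    (negative) feature by at least `margin`."""
    return max(0.0, l2_distance(anchor, positive)
               - l2_distance(anchor, negative) + margin)
```

For example, two pairs of latent codes `[[1.0, 2.0], [0.0, 1.0]]` and `[[1.5, 2.0], [0.0, 0.0]]` give an associated loss of 0.625; minimizing it drives the two translation directions to encode pedestrians into a shared latent space, which is what lets the generated heterogeneous images stay faithful to the real ones.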

Key words: generative adversarial networks, image translation, feature extraction, cross-modal pedestrian re-identification

CLC number: TN911.73