Journal of Xidian University ›› 2022, Vol. 49 ›› Issue (6): 129-138. doi: 10.19665/j.issn1001-2400.2022.06.016

• Computer Science and Technology & Artificial Intelligence •

VAE-Fuse: an unsupervised multi-focus fusion model

WU Kaijun (邬开俊), MEI Yuan (梅源)

  1. School of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China
  • Received: 2022-01-18 Online: 2022-12-20 Published: 2023-02-09
  • Corresponding author: MEI Yuan (born 1998), male, master's student at Lanzhou Jiaotong University, E-mail: meiyuan2551161628@163.com
  • About the author: WU Kaijun (born 1978), male, professor, Ph.D., E-mail: wkj@mail.lzjtu.cn
  • Supported by:
    Natural Science Foundation of Gansu Province (21JR7RA300); Open Project of the Gansu Provincial Research Center for Conservation of Dunhuang Cultural Heritage (o.Gdw2021Yb15)

Abstract:

In multi-focus image fusion, the goal is to preserve as much of the original image information as possible while improving the quality of the fused result. To this end, a two-stage image fusion network based on unsupervised learning is designed by combining the variational autoencoder (VAE) structure with the gray variance product function, a no-reference image sharpness metric. In the training stage, an encoder-decoder network based on the VAE structure is trained to reconstruct the original images, with multi-scale structural similarity as the loss function and a total deviation loss introduced to suppress noise in the image. In the fusion stage, after the trained encoder encodes the features of the images to be fused, an improved gray variance product function is used to identify the sharp pixels. The final decision map is then generated through mathematical morphology optimization, and a weighted fusion strategy completes the final fusion. Experimental results show that, although the method uses fewer model parameters, it retains more of the original image information during encoding and decoding, and that its pixel discrimination outperforms the traditional method based on spatial frequency. Compared with a variety of representative image fusion methods, the proposed method achieves advanced fusion performance in both subjective and objective evaluations.
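
The fusion stage described in the abstract (sharpness measure, decision map, morphological clean-up, weighted fusion) can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: it uses the plain gray variance product (the product of the horizontal and vertical absolute gray-level differences) rather than the authors' improved variant, it works on raw pixel intensities instead of encoder features, it sums the measure over a small window, and it uses a binary opening followed by a closing as the morphological step. The names `window_reduce`, `gray_variance_product` and `fuse` are hypothetical, and images are nested lists so that the block needs no external library.

```python
def window_reduce(img, radius, reduce_fn):
    """Apply reduce_fn (sum, min or max) over each square neighbourhood.

    Coordinates outside the image are clamped to the nearest border pixel.
    """
    h, w = len(img), len(img[0])
    return [
        [
            reduce_fn(
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def gray_variance_product(img):
    """Per-pixel product of horizontal and vertical absolute differences."""
    h, w = len(img), len(img[0])
    return [
        [
            abs(img[y][x] - img[y][min(x + 1, w - 1)])
            * abs(img[y][x] - img[min(y + 1, h - 1)][x])
            for x in range(w)
        ]
        for y in range(h)
    ]


def fuse(img_a, img_b, radius=1):
    """Fuse two grayscale images; returns (fused image, decision map)."""
    # Local sharpness: gray variance product summed over a window (assumed).
    act_a = window_reduce(gray_variance_product(img_a), radius, sum)
    act_b = window_reduce(gray_variance_product(img_b), radius, sum)

    # Decision map: 1 where image A is judged at least as sharp as image B.
    decision = [
        [1 if a >= b else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(act_a, act_b)
    ]

    # Opening (erode, dilate) drops isolated 1s; closing (dilate, erode)
    # fills isolated 0s. min acts as erosion and max as dilation.
    for first, second in ((min, max), (max, min)):
        decision = window_reduce(
            window_reduce(decision, radius, first), radius, second
        )

    # Weighted fusion with the binary decision map as the weight.
    fused = [
        [d * a + (1 - d) * b for d, a, b in zip(row_d, row_a, row_b)]
        for row_d, row_a, row_b in zip(decision, img_a, img_b)
    ]
    return fused, decision
```

As a quick check, take an 8x8 checkerboard as the sharp scene, let one input keep the pattern only in its left half and the other only in its right half (the remaining halves set to a flat gray). `fuse` then selects the left half from the first image and the right half from the second, and the fused result equals the full checkerboard.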

Key words: multi-focus image fusion, unsupervised learning, variational autoencoder, product of gray variance

CLC Number: 

  • TP391