Electronic Science and Technology (电子科技), 2025, Vol. 38, Issue 3: 47-59. doi: 10.16180/j.cnki.issn1007-7820.2025.03.007


融合CNN与Transformer的视网膜OCT图像积液分割方法

陈宇洋, 李峰

  1. 上海理工大学 光电信息与计算机工程学院,上海 200093
  • 收稿日期:2023-09-03 修回日期:2023-10-05 出版日期:2025-03-15 发布日期:2025-03-11
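  • Corresponding author: CHEN Yuyang (b. 1996), male, E-mail: chenyuyang096@163.com, master's student. Research interests: deep learning, medical image processing.
  • About the author: LI Feng (b. 1983), male, Ph.D., associate professor. Research interest: artificial intelligence.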
  • 基金资助:
    国家重点研发计划(2020YFC2008704)

Integration of CNN and Transformer for Retinal OCT Image Fluid Segmentation Method

CHEN Yuyang, LI Feng

  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  • Received: 2023-09-03 Revised: 2023-10-05 Online: 2025-03-15 Published: 2025-03-11
  • Supported by:
    National Key R&D Program of China (2020YFC2008704)

摘要:

针对积液区域尺寸小、形状异质、细节模糊等问题,文中将卷积神经网络(Convolutional Neural Networks, CNN)和Transformer相融合,提出了一种创新的多分支分割网络。该网络包括全卷积路径、Transformer路径和CNN-Transformer融合路径3个关键路径。全卷积路径用于捕获病变区域的细节特征,Transformer路径提取了具有长范围依赖的多尺度非局部特征信息。融合路径同时利用了CNN和Transformer的优势弥补其他分支的不足之处,通过预测头整合3个分支的特征生成最终的分割图。在Kermany数据集、UMN数据集和DUKE数据集上针对视网膜内积液和视网膜下积液进行了视网膜积液分割性能测试。实验结果表明,所提方法的Dice系数为86.63%,交并比为77.02%,灵敏度为89.47%,精确度为85.51%,证明了其有效性,为视网膜积液自动分割问题提供了一种可行的解决方案。

关键词: 视网膜OCT图像, 卷积神经网络, Transformer, 分割网络, IRF, SRF, 视网膜积液, 注意力机制

Abstract:

To address the small size, heterogeneous shape, and blurred detail of fluid regions, this study combines convolutional neural networks (CNN) with Transformer and proposes a novel multi-branch segmentation network. The network consists of three key paths: a fully convolutional path, a Transformer path, and a CNN-Transformer fusion path. The fully convolutional path captures the detailed features of the lesion area, while the Transformer path extracts multi-scale non-local features with long-range dependencies. The fusion path draws on the strengths of both CNN and Transformer to compensate for the weaknesses of the other branches. A prediction head integrates the features of the three branches to generate the final segmentation map. Segmentation performance for intraretinal fluid (IRF) and subretinal fluid (SRF) was tested on the Kermany, UMN, and DUKE datasets. The experimental results show that the proposed method achieves a Dice coefficient of 86.63%, an intersection over union (IoU) of 77.02%, a sensitivity of 89.47%, and a precision of 85.51%, which demonstrates its effectiveness and provides a feasible solution for automatic retinal fluid segmentation.

Key words: retinal OCT images, convolutional neural network, Transformer, segmentation network, IRF, SRF, retinal effusion, attention mechanism

CLC Number:

  • TP391
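
The abstract reports four evaluation measures: the Dice coefficient, 交并比 (intersection over union, IoU), sensitivity, and 精确度 (precision). The sketch below shows how these are commonly computed for one pair of binary masks from the counts of true positives, false positives, and false negatives. It is an illustration of the standard definitions, not the authors' evaluation code. The function names `confusion_counts` and `segmentation_metrics` are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
from typing import Dict, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # rows of 0/1 pixel labels (1 = fluid)


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Count true positives, false positives and false negatives pixel by pixel."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # fluid predicted and present
            elif p and not t:
                fp += 1  # fluid predicted but absent
            elif t and not p:
                fn += 1  # fluid present but missed
    return tp, fp, fn


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Dice, IoU, sensitivity and precision for a single pair of binary masks."""
    tp, fp, fn = confusion_counts(pred, truth)

    def ratio(num: int, den: int) -> float:
        # Assumed convention: an empty denominator yields 0.0.
        return num / den if den else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    for name, value in segmentation_metrics(predicted, reference).items():
        print(f"{name}: {value:.4f}")
```

In this toy example there are 2 true positives, 1 false positive, and 1 false negative, so IoU is 2/4 and the other three measures are 2/3. Whether the paper averages such per-image values or pools pixels across a dataset is not stated in the abstract.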