Electronic Science and Technology, 2025, Vol. 38, Issue (3): 47-59. doi: 10.16180/j.cnki.issn1007-7820.2025.03.007
Received: 2023-09-03
Revised: 2023-10-05
Online: 2025-03-15
Published: 2025-03-11
CHEN Yuyang, LI Feng. Integration of CNN and Transformer for Retinal OCT Image Fluid Segmentation Method[J]. Electronic Science and Technology, 2025, 38(3): 47-59.
Table 1. Comparison with other networks
Network | IRF Dice | IRF IoU | IRF Sensitivity | IRF Precision | SRF Dice | SRF IoU | SRF Sensitivity | SRF Precision | Average Dice | Average IoU | Average Sensitivity | Average Precision |
---|---|---|---|---|---|---|---|---|---|---|---|---|
FCN | 0.744 3 | 0.619 8 | 0.781 6 | 0.720 2 | 0.733 7 | 0.595 8 | 0.806 8 | 0.677 5 | 0.739 0 | 0.607 8 | 0.794 2 | 0.698 9 |
U-Net | 0.807 8 | 0.737 8 | 0.848 7 | 0.809 3 | 0.802 7 | 0.705 4 | 0.873 4 | 0.757 3 | 0.805 3 | 0.721 6 | 0.861 1 | 0.783 3 |
U-Net++ | 0.828 2 | 0.760 1 | 0.849 4 | 0.814 4 | 0.821 4 | 0.728 4 | 0.869 1 | 0.787 1 | 0.824 8 | 0.744 3 | 0.859 3 | 0.800 8 |
Attention U-Net | 0.831 9 | 0.764 4 | 0.851 2 | 0.817 1 | 0.810 3 | 0.714 4 | 0.847 4 | 0.787 2 | 0.821 1 | 0.739 4 | 0.849 3 | 0.802 2 |
CE-Net | 0.801 1 | 0.706 4 | 0.830 4 | 0.784 5 | 0.799 6 | 0.682 0 | 0.851 7 | 0.760 5 | 0.800 4 | 0.694 2 | 0.841 1 | 0.772 5 |
MsTGA-Net | 0.810 1 | 0.699 9 | 0.849 7 | 0.801 2 | 0.804 9 | 0.686 6 | 0.816 8 | 0.815 6 | 0.807 5 | 0.693 3 | 0.833 3 | 0.808 4 |
CPF-Net | 0.824 0 | 0.716 4 | 0.837 6 | 0.844 0 | 0.800 4 | 0.673 9 | 0.827 6 | 0.793 9 | 0.812 2 | 0.695 1 | 0.832 6 | 0.819 0 |
Y-Net | 0.839 0 | 0.779 1 | 0.850 3 | 0.831 9 | 0.822 4 | 0.728 5 | 0.861 2 | 0.793 5 | 0.830 7 | 0.753 8 | 0.855 8 | 0.812 7 |
Proposed network | 0.872 1 | 0.781 1 | 0.900 3 | 0.867 3 | 0.860 5 | 0.759 4 | 0.889 1 | 0.842 9 | 0.866 3 | 0.770 2 | 0.894 7 | 0.855 1 |
Proposed network (with data augmentation) | 0.872 6 | 0.783 6 | 0.895 9 | 0.869 8 | 0.866 7 | 0.760 5 | 0.878 3 | 0.846 2 | 0.869 7 | 0.772 1 | 0.887 1 | 0.858 0 |
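This page does not define the four metrics reported in the tables. The sketch below assumes the standard pixel-wise definitions for a single fluid class, computed from true positives, false positives, and false negatives. The function name `segmentation_metrics` and the toy masks are hypothetical, and how the paper aggregates scores (per image, per volume, or over the whole test set) is not stated here.

```python
def segmentation_metrics(pred, target):
    """Pixel-wise metrics for one class.

    pred, target: flat sequences of 0/1 labels of equal length.
    Assumes the standard definitions; not taken from the paper itself.
    """
    tp = sum(1 for p, t in zip(pred, target) if p and t)      # fluid predicted, fluid present
    fp = sum(1 for p, t in zip(pred, target) if p and not t)  # fluid predicted, none present
    fn = sum(1 for p, t in zip(pred, target) if not p and t)  # fluid missed

    def ratio(num, den):
        # Return 0.0 for an empty denominator instead of raising.
        return num / den if den else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
    }


# Toy example: a flattened 2x4 prediction mask and ground-truth mask.
pred = [1, 1, 0, 0, 0, 1, 0, 0]
target = [1, 0, 0, 0, 0, 1, 1, 0]
m = segmentation_metrics(pred, target)
print(m)
```

In this toy case there are two true positives, one false positive, and one false negative, so IoU is 2/4 and Dice is 4/6. Sensitivity and precision differ only in whether missed pixels or spurious pixels enter the denominator.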
Table 2. Results of ablation experiments
Network | IRF Dice | IRF IoU | IRF Sensitivity | IRF Precision | SRF Dice | SRF IoU | SRF Sensitivity | SRF Precision | Average Dice | Average IoU | Average Sensitivity | Average Precision |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Fully convolutional network branch | 0.816 1 | 0.711 6 | 0.869 6 | 0.806 8 | 0.807 9 | 0.715 5 | 0.859 1 | 0.791 8 | 0.812 0 | 0.713 6 | 0.864 4 | 0.799 3 |
Transformer branch | 0.833 1 | 0.730 4 | 0.814 5 | 0.860 8 | 0.823 7 | 0.710 9 | 0.848 4 | 0.822 9 | 0.828 4 | 0.720 7 | 0.831 5 | 0.841 9 |
Fully convolutional + fusion branch | 0.844 9 | 0.745 1 | 0.852 7 | 0.860 0 | 0.841 5 | 0.732 2 | 0.875 8 | 0.820 7 | 0.843 2 | 0.738 7 | 0.864 2 | 0.840 4 |
Proposed network | 0.872 1 | 0.781 1 | 0.900 3 | 0.867 3 | 0.860 5 | 0.759 4 | 0.889 1 | 0.842 9 | 0.866 3 | 0.770 2 | 0.894 7 | 0.855 1 |
Table 3. Comparison of Dice scores when tested on different datasets
Network | UMN IRF | UMN SRF | DUKE IRF | DUKE SRF |
---|---|---|---|---|
FCN | 0.598 1 | 0.547 5 | 0.436 0 | 0.464 8 |
U-Net | 0.642 7 | 0.583 2 | 0.572 4 | 0.494 0 |
U-Net++ | 0.659 8 | 0.574 4 | 0.570 8 | 0.503 4 |
Attention U-Net | 0.659 2 | 0.626 9 | 0.539 5 | 0.517 3 |
CE-Net | 0.638 0 | 0.603 2 | 0.552 6 | 0.482 7 |
MsTGA-Net | 0.653 7 | 0.612 4 | 0.548 8 | 0.529 8 |
CPF-Net | 0.661 5 | 0.638 4 | 0.569 1 | 0.543 1 |
Y-Net | 0.650 5 | 0.569 3 | 0.579 2 | 0.534 6 |
Proposed network | 0.729 9 | 0.708 8 | 0.613 1 | 0.608 6 |