Journal of Xidian University ›› 2024, Vol. 51 ›› Issue (1): 125-134. doi: 10.19665/j.issn1001-2400.20230304

• Computer Science and Technology •

Self-supervised contrastive representation learning for semantic segmentation

LIU Bochong, CAI Huaiyu, WANG Yi, CHEN Xiaodong

  1. Ministry of Education Key Laboratory of Optoelectronic Information Technology, School of Precision Instrument and Optoelectronic Engineering, Tianjin University, Tianjin 300072, China
  • Received: 2022-10-25 Online: 2024-01-20 Published: 2023-08-22
  • Contact: CAI Huaiyu E-mail: 2020202002@tju.edu.cn; hycai@tju.edu.cn; koala_wy@tju.edu.cn; xdchen@tju.edu.cn

Abstract:

To improve the accuracy of semantic segmentation models while avoiding the labor and time costs of pixel-wise annotation for large-scale semantic segmentation datasets, this paper studies pre-training by self-supervised contrastive representation learning and designs the Global-Local Cross Contrastive Learning (GLCCL) method around the characteristics of the semantic segmentation task. The method feeds both the global image and a series of image patches obtained by local chunking into the network to extract global and local visual representations, respectively. Training is guided by a loss function that combines global contrast, local contrast, and global-local cross contrast, so that the network learns global and local visual representations as well as cross-regional semantic correlations. When BiSeNet is pre-trained with this method and transferred to the semantic segmentation task, the mean intersection over union (MIoU) improves by 0.24% and 0.9% compared with existing self-supervised contrastive representation learning and supervised pre-training methods, respectively. The experimental results show that pre-training a semantic segmentation model on unlabeled data with this method improves segmentation results, which gives the method practical value.
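The abstract names the three loss terms but does not give their form, so the following is only a minimal sketch of how such a combined objective might look. It assumes an InfoNCE-style contrastive loss for every term, equal term weights, a temperature of 0.2, and a cross term that matches each patch embedding to the global embedding of the image it was cut from. The function names `info_nce` and `glccl_style_loss`, the argument `patch_owner`, and all default values are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(queries, keys, positives, temperature=0.2):
    """Mean InfoNCE loss over all queries.

    positives[i] is the index of the key treated as the positive for
    query i; every other key acts as a negative. The temperature is an
    assumed value, not one reported in the abstract.
    """
    q = [l2_normalize(v) for v in queries]
    k = [l2_normalize(v) for v in keys]
    total = 0.0
    for qi, pos in zip(q, positives):
        # Cosine similarity to every key, sharpened by the temperature.
        row = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(row)
        log_denom = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_denom - row[pos]
    return total / len(q)


def glccl_style_loss(g1, g2, l1, l2, patch_owner, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of global, local, and global-local cross terms.

    g1, g2: global embeddings of two augmented views, one per image.
    l1, l2: patch embeddings of the two views, flattened over the batch.
    patch_owner[j]: index of the image that patch j was cut from.
    The pairing rules and the weights are illustrative assumptions.
    """
    global_term = info_nce(g1, g2, list(range(len(g1))))
    local_term = info_nce(l1, l2, list(range(len(l1))))
    # Cross term: a patch should agree with its own image's global view.
    cross_term = info_nce(l1, g2, patch_owner)
    w_global, w_local, w_cross = weights
    return w_global * global_term + w_local * local_term + w_cross * cross_term


# Toy batch: 2 images, 2 patches per image, 2-dimensional embeddings.
g1 = [[1.0, 0.0], [0.0, 1.0]]
g2 = [[1.0, 0.1], [0.1, 1.0]]
l1 = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
l2 = [[1.0, 0.1], [0.8, 0.1], [0.1, 1.0], [0.1, 0.8]]
patch_owner = [0, 0, 1, 1]
loss = glccl_style_loss(g1, g2, l1, l2, patch_owner)
```

In a real pre-training pipeline the embeddings would come from the backbone being pre-trained and the loss would be computed with an automatic-differentiation framework; plain Python lists are used here only to keep the sketch self-contained.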

Key words: semantic segmentation, self-supervised representation learning, contrastive learning, deep learning

CLC Number: 

  • TP391.4