Electronic Science and Technology ›› 2024, Vol. 37 ›› Issue (10): 30-39. doi: 10.16180/j.cnki.issn1007-7820.2024.10.005


Detection of Cervical Lesions Based on Multi-Scale Features and Attention Mechanism

FENG Ting1, YING Jie1, YANG Haima1, LI Fang2   

  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  2. School of Medicine, Tongji University, Shanghai 200120, China
  • Received: 2023-03-06 Online: 2024-10-15 Published: 2024-11-04
  • Supported by:
    Shanghai Science and Technology Innovation Action Plan (21S31904200); Shanghai Science and Technology Innovation Action Plan (22S31903700)

Abstract:

Cervical intraepithelial neoplasia (CIN) is a precancerous lesion of the cervix that is strongly associated with invasive cervical cancer. Accurate detection and classification of CIN helps reduce the incidence of severe cervical cancer. To address the low accuracy of cervical lesion detection and classification, this study proposes YOLOv5-CBTR (You Only Look Once version 5-Convolutional Block Transformer), a cervical lesion detection method that combines multi-scale features with multiple attention mechanisms. The backbone network uses SE-CSP (SENet-BottleneckCSP) modules, which incorporate the SENet (Squeeze-and-Excitation Networks) attention mechanism, for feature extraction. A Transformer encoder module is introduced to fuse and amplify multi-feature information, and its multi-head attention mechanism strengthens feature extraction in lesion regions. Convolutional attention modules are added to the feature fusion layer for multi-scale fusion of lesion feature information. A power transformation is applied in the bounding-box regression calculation, which speeds up convergence of the model's loss function. Together, these components enable the detection and classification of cervical lesions. Experimental results show that, for the detection and classification of RGB cervical lesion images, the YOLOv5-CBTR model achieves an accuracy of 93.99%, a recall of 92.91%, a mAP (mean Average Precision) of 92.80%, and an F value of 93.45%. For multispectral cervical images, the model achieves a mAP of 97.68% and an F value of 95.23%.
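
The F value and the power-transformed box regression mentioned above can be illustrated with a short sketch. The Python code below is not taken from the paper. It assumes that the F value is the harmonic mean of precision and recall, with the reported 93.99% figure read as precision, and that the power transformation takes the form 1 - IoU^alpha, as in Alpha-IoU-style losses. The function names, the (x1, y1, x2, y2) box format, and the default alpha = 3 are illustrative choices only.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def power_iou_loss(pred, target, alpha=3.0):
    """Assumed form of a power-transformed IoU loss: 1 - IoU ** alpha.

    alpha = 1 gives the plain IoU loss; the paper's exact formula and
    exponent are not stated in the abstract.
    """
    return 1.0 - iou(pred, target) ** alpha


if __name__ == "__main__":
    # Cross-check of the RGB result, reading 93.99 as precision (assumption).
    print(round(f_measure(93.99, 92.91), 2))  # 93.45
    # Loss for a partially overlapping prediction (IoU = 1/7).
    print(power_iou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Under these assumptions, the first check reproduces the 93.45% F value quoted for RGB images, which suggests the reported figures are mutually consistent when the first one is read as precision.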

Key words: cervical image, lesion detection, multiscale features, attention mechanism, multispectral image, transformer encoder, power transformation, deep learning

CLC Number: 

  • TP391