Journal of Xidian University ›› 2022, Vol. 49 ›› Issue (6): 103-110. doi: 10.19665/j.issn1001-2400.2022.06.013

• Computer Science and Technology & Artificial Intelligence •

High efficient framework for large-scale zero-shot image recognition

ZHANG Zehuan1,2, LIU Qiang1,2, GUO Difei3

  1. School of Microelectronics, Tianjin University, Tianjin 300072, China
    2. Tianjin Key Laboratory of Imaging and Sensing Microelectronic Technology, Tianjin 300072, China
    3. Tianjin Communication & Broadcasting Group Co., Ltd., Tianjin 300140, China
  • Received: 2022-01-07 Online: 2022-12-20 Published: 2023-02-09

Abstract:

Large-scale zero-shot image recognition involves a large number of classes, which makes model training difficult and costly. To address these problems, this paper designs a highly efficient zero-shot learning framework that improves accuracy and generalization ability at a low training cost. The framework defines a joint space and uses an image branch network and a semantic branch network to map vectors of the two modalities into this space, where both model training and inference are performed. In the image branch network, a perceptron network maps image feature vectors to the joint space in order to change their distribution. In the semantic branch network, graph convolutional networks map semantic vectors to the joint space. In addition, a loss function is designed to constrain the joint space so that different classes become more discriminable in it, which is conducive to model training. Experimental results on ImageNet show that, on the “2-HOPS” test set, compared with existing methods without fine-tuning, the accuracy of our algorithm increases by 1.1% and the training time decreases by 57.8%; compared with existing algorithms after fine-tuning, our algorithm saves 98.4% of the training time without any loss of accuracy. These results show that the method improves model performance at a low training cost.
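The two-branch design described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the single graph-convolution layer, the chain-shaped class graph, the cosine-similarity softmax loss, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of a class graph."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)

def image_branch(feats, w1, w2):
    """Two-layer perceptron: image feature vectors -> joint space."""
    hidden = np.maximum(feats @ w1, 0.0)  # ReLU
    return l2_normalize(hidden @ w2)

def semantic_branch(sem, a_norm, w):
    """One graph-convolution layer: class semantic vectors -> joint space."""
    return l2_normalize(a_norm @ sem @ w)

def joint_space_loss(img_emb, cls_emb, labels, temperature=0.1):
    """Softmax cross-entropy over cosine similarities (an assumed loss)."""
    logits = img_emb @ cls_emb.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()

def predict(img_emb, cls_emb):
    """Assign each image to the most similar class in the joint space."""
    return (img_emb @ cls_emb.T).argmax(axis=1)

# Toy sizes (hypothetical): 5 classes linked in a chain, 6 images.
n_classes, img_dim, sem_dim, hidden_dim, joint_dim = 5, 16, 8, 12, 4
adj = np.zeros((n_classes, n_classes))
for i in range(n_classes - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
a_norm = normalize_adjacency(adj)

sem = rng.normal(size=(n_classes, sem_dim))   # e.g. class word embeddings
feats = rng.normal(size=(6, img_dim))         # e.g. CNN image features
labels = np.array([0, 1, 2, 3, 4, 0])

w1 = rng.normal(scale=0.1, size=(img_dim, hidden_dim))
w2 = rng.normal(scale=0.1, size=(hidden_dim, joint_dim))
w_g = rng.normal(scale=0.1, size=(sem_dim, joint_dim))

img_emb = image_branch(feats, w1, w2)
cls_emb = semantic_branch(sem, a_norm, w_g)
loss = joint_space_loss(img_emb, cls_emb, labels)
preds = predict(img_emb, cls_emb)
```

In this sketch, training would minimize `loss` with respect to `w1`, `w2`, and `w_g`. At test time, unseen classes only need their semantic vectors and graph edges to obtain joint-space embeddings from `semantic_branch`, which is the property that makes zero-shot inference possible.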

Key words: deep learning, knowledge graph, graph neural networks

CLC Number: 

  • TP183