Electronic Science and Technology ›› 2022, Vol. 35 ›› Issue (4): 20-27. doi: 10.16180/j.cnki.issn1007-7820.2022.04.004


Global and Local Scene Representation Method Based on Deep Convolutional Features

Chaowei LIN, Feifei LI, Qiu CHEN

  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  • Received: 2020-11-21 Online: 2022-04-15 Published: 2022-04-15
  • Supported by:
    The Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (ES2015XX)


Scene recognition is a fundamental task in computer vision. Unlike image classification, scene recognition must jointly consider global layout information, local scene features, and object features, which is why classic convolutional neural networks perform poorly on it. To address this issue, this study proposes a global and local scene representation method based on deep convolutional features. The proposed method transforms the deep convolutional features of a scene image to generate a comprehensive representation for each image. Specifically, class activation mapping (CAM) is used to discover local key regions, and a long short-term memory (LSTM) network encodes the convolutional features extracted from these regions to produce the local representation of the scene image. An attention mechanism is adopted to fuse scene features and object features into a global representation of the scene image. Finally, evaluation experiments are conducted on the MIT Indoor 67 dataset, and the results show that the proposed method achieves a test accuracy of up to 87.59%.
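
The two feature transforms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the relative threshold `ratio`, the bounding-box rule for picking a key region, and the dot-product scoring against a `query` vector are all assumptions made for illustration, and the LSTM encoding step is omitted. The sketch only shows the general idea of a class activation map (a weighted sum of convolutional feature maps) and of attention-style fusion (a softmax-weighted sum of feature vectors).

```python
import math


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W), one weight per channel."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def key_region(cam, ratio=0.5):
    """Inclusive box (top, left, bottom, right) around strongly activated cells.

    Assumed rule: keep cells whose value is at least `ratio` of the way
    from the map's minimum to its maximum.
    """
    values = [v for row in cam for v in row]
    low, high = min(values), max(values)
    threshold = low + ratio * (high - low)
    cells = [(y, x) for y, row in enumerate(cam)
             for x, v in enumerate(row) if v >= threshold]
    ys = [y for y, _ in cells]
    xs = [x for _, x in cells]
    return min(ys), min(xs), max(ys), max(xs)


def attention_fuse(features, query):
    """Softmax-weighted sum of feature vectors.

    Assumed scoring: dot product of each feature vector with `query`.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    fused = [sum(w * feat[i] for w, feat in zip(weights, features))
             for i in range(dim)]
    return fused, weights


# Toy input: two 2 x 2 feature maps and their (hypothetical) class weights.
maps = [[[0.0, 1.0], [0.0, 2.0]],
        [[1.0, 0.0], [0.0, 0.0]]]
cam = class_activation_map(maps, [1.0, 0.5])
box = key_region(cam)

# Toy scene and object feature vectors fused with a zero query,
# which gives both vectors equal attention weight.
fused, weights = attention_fuse([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

In a real pipeline the feature maps would come from the last convolutional layer of a trained network and the query would be learned; here they are hand-written values so the arithmetic is easy to follow.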

Key words: scene recognition, convolutional neural networks, convolutional features, feature transform, CAM, LSTM, attention mechanism, end-to-end network

CLC Number: 

  • TP391