Electronic Science and Technology ›› 2020, Vol. 33 ›› Issue (12): 54-58. doi: 10.16180/j.cnki.issn1007-7820.2020.12.011

Scene Recognition Algorithm Based on Convolutional Neural Networks and Multi-Scale Space Encoding

MIAO Ran, LI Feifei, CHEN Qiu

  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 20093, China
  • Received: 2019-09-14 Online: 2020-12-15 Published: 2020-12-22
  • Supported by:
    The Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (ES2015XX)


A scene image generally consists of foreground objects and background context arranged in a particular spatial layout. Because of differences in scale, viewpoint, and background, images within the same scene class show large intra-class variation. Conversely, objects shared across scene classes produce inter-class similarity among different scenes. To address both problems, this study proposes a multi-scale space encoding for scene representation based on convolutional neural networks (CNN), which combines multi-scale dense sampling, CNN features, and multi-scale space encoding. The encoding method spatially partitions the sampling grid several times and then aggregates the CNN features within sub-regions of different shapes to generate the multi-scale space VLAD. Experiments on the Scene15 scene dataset show that the method reaches a test accuracy of 94.67%.
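
The encoding step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the partition layouts, the codebook size, or the normalization, so the `partitions` tuple, the power and L2 normalization, and the function names `vlad` and `multiscale_spatial_vlad` are all assumptions made for illustration. The codebook `centers` stands in for K-means cluster centers, and `grid_features` stands in for CNN features sampled on a dense grid; the PCA and SVM stages named in the key words are omitted.

```python
import numpy as np


def vlad(features, centers):
    """Aggregate local features (n, D) into a VLAD vector of length K * D.

    Each feature is assigned to its nearest center, and the residuals
    (feature - center) are summed per center. Power and L2 normalization
    are a common choice, assumed here rather than taken from the paper.
    """
    k, d = centers.shape
    residual_sums = np.zeros((k, d))
    if len(features) > 0:
        # Squared Euclidean distance from every feature to every center: (n, K)
        dist2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        nearest = dist2.argmin(axis=1)
        for j in range(k):
            members = features[nearest == j]
            if len(members) > 0:
                residual_sums[j] = (members - centers[j]).sum(axis=0)
    v = residual_sums.ravel()
    v = np.sign(v) * np.sqrt(np.abs(v))  # power normalization
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def multiscale_spatial_vlad(grid_features, centers,
                            partitions=((1, 1), (2, 2), (3, 1), (1, 3))):
    """Concatenate one VLAD vector per sub-region over several grid partitions.

    grid_features: (H, W, D) array of local descriptors on the sampling grid.
    partitions: hypothetical (rows, cols) splits giving sub-regions of
                different shapes (whole grid, quadrants, horizontal bands,
                vertical bands).
    """
    h, w, d = grid_features.shape
    parts = []
    for rows, cols in partitions:
        row_edges = np.linspace(0, h, rows + 1).astype(int)
        col_edges = np.linspace(0, w, cols + 1).astype(int)
        for r in range(rows):
            for c in range(cols):
                cell = grid_features[row_edges[r]:row_edges[r + 1],
                                     col_edges[c]:col_edges[c + 1]]
                parts.append(vlad(cell.reshape(-1, d), centers))
    return np.concatenate(parts)
```

With the default partitions in this sketch (1 + 4 + 3 + 3 = 11 sub-regions), a codebook of K = 4 centers, and D = 8 feature dimensions, the resulting descriptor has 11 × 4 × 8 = 352 entries. In a full pipeline, one such descriptor per image would typically be reduced in dimension and passed to a classifier.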

Key words: scene recognition, convolutional neural networks, K-means clustering, VLAD, PCA, SVM

CLC Number: 

  • TP391