Electronic Science and Technology (电子科技) ›› 2019, Vol. 32 ›› Issue (1): 38-41. doi: 10.16180/j.cnki.issn1007-7820.2019.01.008


Scene Recognition Algorithm Based on Sparse Autoencoder

XIE Lin, LI Feifei, CHEN Qiu

  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  • Received: 2018-12-28 Online: 2019-01-15 Published: 2018-12-29
  • About the authors: XIE Lin (b. 1992), male, master's student. Research interests: computer vision and pattern recognition. | LI Feifei (b. 1970), female, Ph.D., professor. Research interests: multimedia information processing, image processing and pattern recognition, and information retrieval, among others. | CHEN Qiu (b. 1972), male, Ph.D., professor, doctoral supervisor. Research interests: image processing and pattern recognition, computer vision, and information retrieval, among others.
  • Supported by:
    The Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (ES2012XX); The Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (ES2014XX)

Abstract:

To narrow the semantic gap between low-level features and high-level concepts in scene recognition, a scene recognition method based on the sparse autoencoder is proposed. The method uses a feature encoding scheme that combines a sparse autoencoder with spatial pyramid pooling. First, local HOG descriptors are extracted from the scene image; a modified sparse autoencoder then encodes these descriptors into sparse features. Spatial pyramid pooling followed by local normalization of the sparse features yields the representation of the whole image. Finally, a linear SVM performs the classification. Experiments on the standard Scene-15 dataset show that the algorithm raises the recognition accuracy to 81.97%.

Key words: scene recognition, sparse autoencoder, spatial pyramid pooling, local normalization, HOG, SVM
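
The encoding and pooling stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the modification to the sparse autoencoder, so the code assumes a standard sigmoid encoder with a KL-divergence sparsity term, max pooling over a 1x1, 2x2, 4x4 pyramid, and per-cell L2 normalization standing in for "local normalization". All names, dimensions, and the random weights and descriptors are hypothetical placeholders; no training is performed.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(descriptors, W, b):
    """Hidden-layer activations of an autoencoder: one code per local descriptor."""
    return sigmoid(descriptors @ W + b)

def kl_sparsity_penalty(codes, rho=0.05):
    """Standard KL sparsity term on mean hidden activations (assumed, not from the paper)."""
    rho_hat = np.clip(codes.mean(axis=0), 1e-8, 1 - 1e-8)
    return float(np.sum(rho * np.log(rho / rho_hat)
                        + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))))

def spatial_pyramid_pool(codes, xy, levels=(1, 2, 4), eps=1e-12):
    """Max-pool codes in each pyramid cell, L2-normalize each cell, concatenate.

    codes: (n, k) array of codes; xy: (n, 2) descriptor positions scaled to [0, 1).
    """
    n, k = codes.shape
    pooled = []
    for g in levels:
        cell = np.minimum((xy * g).astype(int), g - 1)  # grid coordinates per descriptor
        cell_id = cell[:, 1] * g + cell[:, 0]
        for c in range(g * g):
            mask = cell_id == c
            v = codes[mask].max(axis=0) if mask.any() else np.zeros(k)
            pooled.append(v / (np.linalg.norm(v) + eps))
    return np.concatenate(pooled)

# Hypothetical sizes: 200 local descriptors, 36-dim HOG, 64 hidden units.
rng = np.random.default_rng(0)
n, d, k = 200, 36, 64
descriptors = rng.random((n, d))          # stand-in for real HOG descriptors
xy = rng.random((n, 2))                   # stand-in for descriptor positions
W = rng.normal(scale=0.1, size=(d, k))    # untrained weights, for shape checking only
b = np.zeros(k)

codes = encode(descriptors, W, b)
feature = spatial_pyramid_pool(codes, xy)
print(feature.shape)  # (1 + 4 + 16) cells * 64 units = 1344 dimensions
```

In a full pipeline, one such vector per image would be passed to a linear SVM (for example scikit-learn's `LinearSVC`), and the sparsity term would be added to the reconstruction loss while training the autoencoder weights.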

CLC number:

  • TP391