J4 ›› 2012, Vol. 39 ›› Issue (6): 78-83. doi: 10.3969/j.issn.1001-2400.2012.06.013

• Research Paper •

Breaking CAPTCHAs using rotation normalization and a rough-matching algorithm

GAO Haichang; FAN Ye; WANG Wei

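  1. (Institute of Software Engineering, Xidian University, Xi'an, Shaanxi 710071, China)
  • Received: 2011-07-15 Publication date: 2012-12-20 Release date: 2013-01-17
  • Corresponding author: GAO Haichang
  • About the author: GAO Haichang (born 1978), male, associate professor. E-mail: hchgao@xidian.edu.cn
  • Funding:

    Supported by the National Natural Science Foundation of China (60903198) and the Fundamental Research Funds for the Central Universities (72125274)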

Breaking the CAPTCHA with rotation normalization and the rough-matching algorithm

GAO Haichang; FAN Ye; WANG Wei

  1. (Inst. of Software Engineering, Xidian Univ., Xi'an  710071, China)
  • Received: 2011-07-15 Online: 2012-12-20 Published: 2013-01-17
  • Contact: GAO Haichang

Summary:

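A CAPTCHA is a Turing test that distinguishes computer programs from humans. This paper proposes a method that uses a rough-matching sequential similarity detection algorithm (SSDA) to break CAPTCHAs whose characters are rotated but neither distorted nor connected to one another. Normalizing the rotation angle of each rotated character reduces the number of templates, and the rough-matching algorithm reduces the time spent matching against each individual template, so both speed and accuracy improve substantially. Taking a CAPTCHA found online as an example, the paper describes the four stages of the attack: image preprocessing, character extraction, character rotation, and character recognition. Experimental results show that, with the rough-matching-based SSDA, the success rate reaches 85% and the average time needed to break one CAPTCHA image is 3.5 s, far better than comparable algorithms.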

Keywords: CAPTCHA, image rotation, template matching, sequential similarity detection algorithm
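
The rotation-normalization step described above can be illustrated with a minimal sketch. The abstract does not state which criterion the authors use to pick the normalized angle, so the code below assumes one common choice: try a range of angles and keep the one that minimizes the width of the character's bounding box. The function names (`rotate_points`, `bbox_width`, `normalize_rotation`), the representation of a character as a list of foreground pixel coordinates, and the ±60° search range are all illustrative assumptions, not details from the paper.

```python
import math


def rotate_points(points, angle_deg):
    """Rotate (x, y) points about their centroid by angle_deg degrees."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [
        (cx + (x - cx) * cos_t - (y - cy) * sin_t,
         cy + (x - cx) * sin_t + (y - cy) * cos_t)
        for x, y in points
    ]


def bbox_width(points):
    """Horizontal extent of the point set."""
    xs = [x for x, _ in points]
    return max(xs) - min(xs)


def normalize_rotation(points, max_angle=60, step=1):
    """Assumed criterion: pick the angle (in degrees) that gives the
    narrowest bounding box, and return it with the rotated points."""
    best_angle = min(
        range(-max_angle, max_angle + 1, step),
        key=lambda a: bbox_width(rotate_points(points, a)),
    )
    return best_angle, rotate_points(points, best_angle)
```

Once every extracted character has been brought to a single canonical orientation in this way, the template library needs only one template per character class rather than one per class and angle, which is the reduction the abstract refers to.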

Abstract:

A CAPTCHA is a Turing test used to tell computers from humans. This paper presents a sequential similarity detection algorithm (SSDA) based on rough matching to break CAPTCHAs whose characters are rotated but not distorted. Each character is rotated to normalize its rotation angle, which reduces the size of the template library, and the rough-matching algorithm shortens the time spent on matching. Taking the CAPTCHA of a bank as an example, we document the four stages of the attack: preprocessing, extraction, rotation, and recognition. Experimental results show that the SSDA matching algorithm based on rough matching achieves a success rate of 85% with an average breaking time of 3.5 s, considerably better than similar algorithms.

Key words: CAPTCHA, image rotation, template matching, sequential similarity detection algorithm
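
To make the recognition stage more concrete, here is a hedged sketch of SSDA template matching with a rough (coarse) first pass. SSDA accumulates absolute pixel differences between the image and a template and abandons the comparison as soon as the running sum exceeds a threshold. The abstract does not specify how the rough pass is built, so this sketch assumes it compares only every `coarse_step`-th row and column and discards templates that exceed `coarse_threshold`; the names `ssda_error` and `recognize`, the parameter values, and the representation of images as equally sized 2D lists of 0/1 pixels are all assumptions made for illustration.

```python
def ssda_error(image, template, threshold, step=1):
    """Accumulate absolute pixel differences over every `step`-th row and
    column. Return the accumulated error, or None if the running sum
    exceeds `threshold` and the comparison is abandoned early."""
    error = 0
    for r in range(0, len(template), step):
        for c in range(0, len(template[0]), step):
            error += abs(image[r][c] - template[r][c])
            if error > threshold:
                return None
    return error


def recognize(image, templates, coarse_step=2, coarse_threshold=4):
    """Return the label of the best-matching template, or None.

    `templates` maps a label to a 2D list with the same size as `image`.
    """
    # Rough pass (assumed design): a sparse pixel grid cheaply rejects
    # templates that are clearly wrong.
    candidates = [
        label for label, tpl in templates.items()
        if ssda_error(image, tpl, coarse_threshold, coarse_step) is not None
    ]
    # Fine pass: full-resolution SSDA on the survivors, using the best
    # error found so far as the early-termination threshold.
    best_label, best_error = None, float("inf")
    for label in candidates:
        err = ssda_error(image, templates[label], best_error)
        if err is not None and err < best_error:
            best_label, best_error = label, err
    return best_label
```

With `coarse_step=2` the rough pass inspects about a quarter of the pixels per template, and early termination cuts the remaining work further. Both thresholds would need tuning against real CAPTCHA samples.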