Electronic Science and Technology (电子科技), 2022, Vol. 35, Issue (7): 27-31. doi: 10.16180/j.cnki.issn1007-7820.2022.07.005


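A Tesseract_OCR-Based Quality Inspection Algorithm for Inkjet Codes on Chemical Packaging Bags

ZHANG Maolin1, YE Qingzhou1, PAN Xin2, LU Hua3

  1. School of Electronic, Electrical Engineering and Physics, Fujian University of Technology, Fuzhou 350118, Fujian, China
    2. School of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118, Fujian, China
    3. Fuzhou Sunlong Inkjetprint Technology Co., Ltd., Fuzhou 350014, Fujian, China
  • Received: 2021-02-05 Online: 2022-07-15 Published: 2022-08-16
  • About the authors: ZHANG Maolin (b. 1990), male, lecturer. Research interests: machine vision, artificial intelligence. | YE Qingzhou (b. 1968), male, professor. Research interests: Internet of Things, image processing and pattern recognition.
  • Funding:
    National Natural Science Foundation of China (41971340); National Natural Science Foundation of China (41471333); Project of Fujian Provincial Department of Science and Technology (2018H001); Project of Fujian Provincial Department of Science and Technology (2019I0019)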

Quality Inspection Algorithm of Chemical Packaging Bag Coding Based on Tesseract_OCR

ZHANG Maolin1, YE Qingzhou1, PAN Xin2, LU Hua3

  1. School of Electronic, Electrical Engineering and Physics, Fujian University of Technology, Fuzhou 350118, China
    2. School of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118, China
    3. Fuzhou Sunlong Inkjetprint Technology Co., Ltd., Fuzhou 350014, China
  • Received: 2021-02-05 Online: 2022-07-15 Published: 2022-08-16
  • Supported by:
    National Natural Science Foundation of China (41971340); National Natural Science Foundation of China (41471333); Project of Fujian Provincial Department of Science and Technology (2018H001); Project of Fujian Provincial Department of Science and Technology (2019I0019)

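Abstract (translated from the Chinese):

Manual inspection of the information inkjet-printed on chemical packaging bags is inefficient and has a high miss rate. To address this problem, the paper designs a machine-vision-based method for inspecting the quality of inkjet codes on chemical packaging bags. The captured image is preprocessed with mean filtering and Gaussian bilateral filtering, and the character region is then located with a variable-threshold algorithm based on local statistics. Because the distance between the dots of an inkjet character can exceed the gap between characters, the closing operation on the binary image makes several characters stick together into a single connected domain. To solve this, the paper proposes a dynamic character segmentation algorithm based on an improved connected-domain method; the segmented character images are then used for classification training and recognition with the Tesseract_OCR engine. Experimental results show that the algorithm reaches a precision as high as 95.62% in inkjet code quality inspection, which meets the requirements for inspecting inkjet codes on chemical packaging bags.

Keywords (translated from the Chinese): machine vision, Tesseract_OCR, chemical packaging bags, inkjet code quality inspection, preprocessing, character localization, improved connected domain, character segmentation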

Abstract:

Manual inspection of the information printed on chemical packaging bags suffers from low efficiency and a high missed-detection rate. To address this, the study designs a machine-vision-based method for inspecting the quality of inkjet-printed codes on chemical packaging bags. Mean filtering and Gaussian bilateral filtering are used to preprocess the captured image, and the character region is then located with a variable-threshold algorithm based on local statistics. Because the distance between the dots of a printed character may be larger than the gap between adjacent characters, the closing operation on the binary image can merge several characters into a single connected domain. To solve this problem, the study proposes a dynamic character segmentation algorithm based on an improved connected-domain method. The segmented character images are then used to train the Tesseract_OCR engine, which classifies and recognizes them. Experimental results show that the algorithm achieves an accuracy of 95.62% in coding quality inspection, which meets the requirements of chemical packaging bag coding quality inspection.
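
The sticking problem described in the abstract can be illustrated with a small sketch. This is not the paper's segmentation algorithm, whose details the abstract does not give. It is a hypothetical width-based heuristic, assuming that connected-domain bounding boxes are already available as `(x, y, w, h)` tuples and that most boxes contain a single character. The function name `split_merged_boxes` and its parameters are illustrative assumptions.

```python
from statistics import median


def split_merged_boxes(boxes):
    """Split bounding boxes that probably cover several stuck-together characters.

    boxes: list of (x, y, w, h) tuples, one per connected domain.
    Assumption: the median width approximates the width of a single character.
    """
    if not boxes:
        return []

    # Reference width of one character, estimated from all boxes.
    ref_width = median(w for _, _, w, _ in boxes)

    result = []
    for x, y, w, h in sorted(boxes):  # process left to right
        # Estimate how many characters this box spans (at least one).
        count = max(1, round(w / ref_width))
        for i in range(count):
            # Integer slice boundaries that together cover the full width.
            left = x + (i * w) // count
            right = x + ((i + 1) * w) // count
            result.append((left, y, right - left, h))
    return result


if __name__ == "__main__":
    # Four connected domains; the third is about twice as wide as the others.
    example = [(0, 0, 20, 30), (25, 0, 21, 30), (50, 0, 41, 30), (95, 0, 19, 30)]
    for box in split_merged_boxes(example):
        print(box)
```

The heuristic breaks down when most connected domains are themselves merged, because the median width then no longer reflects a single character. A practical system would likely need a more robust width estimate, for example one derived from the known font of the printer.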

Key words: machine vision, Tesseract_OCR, chemical packaging bags, coding quality inspection, preprocessing, character localization, improved connected domain, character segmentation

CLC Number:

  • TP391.41