
• Research Paper

MAZ segmentation approach to the table-form image

WANG Quan1; WANG Lai-jing2; WAN Bo1

  1. Research Inst. of Computer Peripherals, Xidian Univ., Xi'an 710071, China
  2. College of Physics and Information Engineering, Henan Normal Univ., Xinxiang 453007, China
  • Online: 2008-04-20 Published: 2008-03-28
  • Contact: WANG Quan

Abstract: Based on line-intersection features and the dependency between the name fields and data fields of a table, the table layout is classified into six basic structures, and the Maximum Attributive Zone (MAZ) of a point is defined accordingly. An MAZ-based segmentation algorithm for table-form images is then presented. The algorithm not only analyses the logical structure of filled-in table-form images but also segments them according to the basic layout structures, so that interdependent cells are placed in the same sub-table. Experimental results show that the method is effective.

Key words: line intersection features, basic layout structure, table image segmentation

CLC Number:

  • TP391.41