Institutional Repository of the Institute of Software, Chinese Academy of Sciences
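Title: 表格识别系统应用中若干问题的研究
Author: Bu Feiyu (卜飞宇)
Defense date: 2004
Major: Computer Application Technology
Degree-granting institution: Institute of Software, Chinese Academy of Sciences
Place of conferral: Institute of Software, Chinese Academy of Sciences
Degree: Doctorate
Keywords: form recognition; discrimination of tables from graphics; form frame line removal; image skew detection and correction
Other title: A Study on Some Problems in the Application of Form Recognition System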
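Abstract: Forms are a common type of document, widely used in daily work and life. With the development of computer technology, using computers to acquire, store, and manage the enormous volume of form information has increasingly become a focus of attention. Form recognition systems have begun to serve as an effective tool for replacing manual entry and acquiring form information automatically. Addressing several problems that existing form recognition systems encounter in application, this thesis studies in depth the discrimination of tables from graphics, the removal of form frame lines from color bill images, and the detection of skew angles in grayscale and color form images, with the following results:
1. Existing systems have a high misclassification rate when distinguishing tables from graphics. This thesis proposes a method that distinguishes tables from graphics using frame line and cell information. Drawing on the structural features of tables, the method sets out several constraints that frame lines and cells, as the elements of a table, must satisfy, and distinguishes tables from graphics by checking whether each constraint is met. Experiments show that the method effectively reduces the misclassification rate.
2. Overlap between characters and lines seriously interferes with character segmentation and recognition. Earlier frame line removal algorithms based on binary images can eliminate the interference of frame lines with character recognition only to a limited extent. With the rapid increase in computing speed and storage capacity, form recognition systems have begun to use grayscale and color images as scanned input. This thesis proposes a frame line removal algorithm based on color images; by using color and grayscale information, it better eliminates the interference of frame lines with character recognition. The method has been successfully applied in a bank bill recognition system.
3. To solve the skew problem in grayscale and color bill images, this thesis proposes a method that detects the skew angle of a scanned image from the black borders produced during scanning. The method determines the skew angle from lines fitted to the four detected edges. Experiments show that the method is very fast and highly accurate, and that it applies to all grayscale and color scans of white (light-colored) rectangular paper. The method is currently used for skew correction and black border removal in color bank bill images and grayscale business card images.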
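The abstract does not describe the color-based frame line removal algorithm itself, so the following is only a minimal sketch of why color information helps. In a binary image a colored frame line and dark character ink both turn black, whereas in a color image they can be told apart. The function name `remove_frame_lines`, the parameters `line_color`, `tolerance`, and `ink_max`, and the pixel-wise thresholding rule are all assumptions made for illustration, not the thesis's method.

```python
def luminance(pixel):
    """Approximate grayscale value (0-255) of an (r, g, b) pixel."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


def color_distance_sq(p, q):
    """Squared Euclidean distance between two RGB colors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def remove_frame_lines(image, line_color, background=(255, 255, 255),
                       tolerance=60, ink_max=60):
    """Return a copy of `image` with frame-line pixels painted as background.

    `image` is a list of rows of (r, g, b) tuples. A pixel is treated as
    frame line when its color is close to `line_color` (hypothetical rule)
    and it is not dark enough to be character ink.
    """
    result = []
    for row in image:
        new_row = []
        for pixel in row:
            is_ink = luminance(pixel) <= ink_max
            is_line = color_distance_sq(pixel, line_color) <= tolerance ** 2
            new_row.append(background if is_line and not is_ink else pixel)
        result.append(new_row)
    return result
```

For example, with a red frame line crossing a black stroke, the red pixels are replaced by the background color while the stroke is left untouched, which a binarized image could not guarantee where the two overlap.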
English abstract: Forms are widely used to collect and distribute data in daily office operations. Using computers to capture, store, and manage large volumes of form documents is becoming increasingly important, and form recognition systems are becoming an effective alternative to manual input for capturing form information. Addressing several problems in the application of form recognition systems, the author has studied three topics: distinguishing tables from graphics; removing form frame lines from color financial bill images; and detecting the skew angle of grayscale and color form images. The results are as follows.
1) To reduce classification errors, this thesis presents a method that distinguishes tables from graphics using structural constraints on table frame lines and cells. Based on the structure of a table, it sets out necessary constraints that all frame lines and cells in a table must satisfy, and it verifies these constraints to distinguish tables from graphics. Experiments show that the method is effective.
2) Characters often overlap form frame lines, and such overlap seriously degrades character recognition. Almost all existing frame line removal algorithms are based on binary images and have limitations. This thesis presents a new frame line removal algorithm based on color images. By using the color and grayscale information in the image, the method handles overlap better. Its effectiveness has been demonstrated in a financial document recognition system.
3) To meet the needs of financial document recognition systems, this thesis presents a new skew detection and correction method based on the black border of grayscale scans of financial bills. The method determines the skew angle of a bill image from lines fitted to the four borders of the bill. Experiments show that the approach is fast, accurate, and effective. The algorithm can be extended to other grayscale and color scans of rectangular white paper.
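The border-fitting idea in item 3 can be sketched as follows. This is a minimal illustration, assuming the edge points between the black scan border and the paper have already been extracted for each of the four sides. The function names, the least-squares fit, and the choice to average the four per-edge angles are assumptions, not details given in the abstract.

```python
import math


def edge_angle(points, near_horizontal):
    """Fit a least-squares line to edge points and return its skew in degrees.

    For a near-horizontal edge the fit is y = a*x + b and the skew is atan(a).
    For a near-vertical edge the fit is x = c*y + d and the skew is atan(-c),
    so that all four edges of a rotated rectangle report the same angle.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if near_horizontal:
        return math.degrees(math.atan2(sxy, sxx))
    return math.degrees(math.atan2(-sxy, syy))


def estimate_skew(top, bottom, left, right):
    """Combine the four per-edge angles (here by a simple average)."""
    angles = [
        edge_angle(top, True),
        edge_angle(bottom, True),
        edge_angle(left, False),
        edge_angle(right, False),
    ]
    return sum(angles) / len(angles)
```

Given four lists of `(x, y)` edge points, `estimate_skew(top, bottom, left, right)` returns a single angle in degrees; rotating the image by the negative of that angle would then deskew it. A median could replace the average if one edge is expected to be noisy.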
Language: Chinese
Content type: Dissertation
URI: http://ir.iscas.ac.cn/handle/311060/6218
Appears in Collections: Institute of Software, Chinese Academy of Sciences

Files in This Item:
File Name/File Size  Content Type  Version  Access  License
LW014104.pdf (2562KB)  --  --  Restricted access  --  Contact to obtain full text

Recommended Citation:
Bu Feiyu (卜飞宇). 表格识别系统应用中若干问题的研究 [D]. Institute of Software, Chinese Academy of Sciences. Institute of Software, Chinese Academy of Sciences. 2004-01-01.
Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 

Copyright © 2007-2017 Institute of Software, Chinese Academy of Sciences