OCR and forms processing

Schantz, Herb

To celebrate TAWPI’s 30th anniversary, we will examine many of the changes that have occurred since 1970. While OCR can be traced back to 1809, most of the significant innovations have happened in the past 30 years.

OCR and forms processing began in 1809 when the first patents for inventions that aided the blind were awarded. Eighty years later, Nipkow of Poland invented sequential scanning (raster scanning) which analyzed an image line by line, similar to today’s OCR and image scanners.

The first documented reference to a true optical reader (i.e., a machine that converts printed characters into code) was in 1912 when Mr. Goldberg patented a machine that read printed characters and converted them into telegraph code. Goldberg’s machine enabled telegraph messages to be read, encoded and transmitted without human intervention by converting the typed message into paper tape. This paper then generated the proper Morse code. In 1914, Fournier d’Albe invented the first operational OCR reading machine as an aid to the blind. This “Optophone” was a hand-held scanner, which emitted a “meaningful audio output” when it moved across a printed page.

In 1946, at Sarnoff Laboratories, RCA developed the “Electric Pencil,” which performed many of the same functions as d’Albe’s machine but was considerably more compact. As early as 1948, Mort Taube at the RCA Sarnoff Laboratories suggested that signals from the “Electric Pencil” could be coupled with a facsimile device to transmit a digitized printed page (image) to remote terminals. This is the earliest record of digital imaging.

David Shepard, who founded the Intelligent Machines Research Corporation (IMR), patented the first practical OCR reader for data entry in 1954. This was installed at Reader’s Digest, where it converted typewritten documents into punch cards in the magazine’s subscription department. In 1962, Jacob Rabinow of Rabinow Engineering Company demonstrated the capability to optically recognize handprinted characters. Later that year, Rabinow installed its first OCR system at Ryder Trucking. It optically read two lines of typed numeric data on waybills simultaneously.

In November 1961, Recognition Equipment was founded to design and build OCR systems that could read a variety of fonts and typestyles from real world papers.

From 1961 to 1964, Recognition researched and developed its classic reader, the “Electronic Retina Computing Reader” (ERCR) which was based on the knowledge that Israel Sheinberg gained as a medical student before he turned to engineering.

The first ERCR was installed at the Fireman’s Fund American Insurance Companies in 1964, where it processed multi-font insurance claims. That same year, an ERCR transaction processor system was installed at United Airlines where it read tissue-like airline tickets at speeds of up to 900 items a minute with 99.99% character accuracy.

IBM’s role in forms processing can be traced back to 1957, when it signed an agreement with IMR. In 1965, IBM installed the 1975, a high-speed multi-font reader that processed multi-font employee contributions from page-sized documents at the Social Security Administration.

Early forms processing systems were essentially intelligent digital (forms) processing systems that optically scanned and recognized machine-printed documents and pages, and converted the information into binary code.

Applications addressed by today’s automated data capture systems can be classified into three categories: data entry, transaction processing and desktop/workstation systems. The reading performance of these systems depends on many user-controlled parameters, including the font or fonts to be read, the application, the media used, document handling, forms design and the print quality of the data.

When print quality and forms design standards are met, reading performance is optimized. Typical character acceptance rates for single font machine-printed characters can range between 99% and 99.99%. Typical character acceptance rates for multi-font (machine-printed) characters from documents in large systems (up to 10,000 different typewriters with up to 40 different fonts) can range from 97% to 99.5%. Typical handprint reading performance can range between 98% and 99.5%.
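To put these percentages in perspective, the short Python sketch below estimates how many characters on a single form would fail to be accepted at each end of the ranges quoted above. The 1,000-character form size and the function name `expected_rejects` are assumptions made purely for illustration; only the rates come from the text.

```python
def expected_rejects(chars_per_doc: int, acceptance_rate: float) -> float:
    """Expected number of characters per document that are NOT accepted."""
    return chars_per_doc * (1.0 - acceptance_rate)


# Assumption: a hypothetical form carrying 1,000 characters of data.
CHARS_PER_FORM = 1000

# Acceptance-rate ranges quoted in the article (low end, high end).
RATES = {
    "single font, machine-printed": (0.99, 0.9999),
    "multi-font, machine-printed": (0.97, 0.995),
    "handprint": (0.98, 0.995),
}

if __name__ == "__main__":
    for label, (low, high) in RATES.items():
        worst = expected_rejects(CHARS_PER_FORM, low)
        best = expected_rejects(CHARS_PER_FORM, high)
        print(f"{label}: about {best:.1f} to {worst:.1f} rejects per form")
```

Even a one-point difference in acceptance rate changes the number of characters that must be handled manually on every form, which is why print quality and forms design matter so much in practice.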

For the most part, forms processing applications are concerned with the capture of data from formatted sheets or pages where many lines of data are read. The data captured optically at reading rates of up to 3,600 characters per second can be:

1. A single machine printed font.

2. Several individual fonts.

3. Several fonts intermixed (multi-font).

4. Hand printed.

Output can be programmed to be in a digital computer format that’s compatible with the user’s computer system. Thus, data is converted in one automatic step directly from a printed human-readable document to a digital format.

In most real world applications, it’s impossible to read 100% of printed information, so the unreadable data must be captured and converted into a compatible format. Today’s data entry systems have a total data entry capability, which operates in parallel with the computer-controlled OCR sub-systems.

Data entry (data capture) OCR applications are found in government, healthcare, printing, transport, retail, insurance and utilities. Perhaps the largest single application in “volume of documents processed” is check and remittance processing.

Most of the data read by OCR consists of carbon impressions on forms generated at the transaction source. These are typical real world documents because they’re handled extensively before being sent to the data center for processing. After being read optically, these forms can be sorted at speeds of up to 2,400 documents per minute. The transports used in these applications handle forms that range in weight from tissue (airline tickets) to card stock (U.S. Treasury checks).

Transaction processing applications are used primarily for banking, credit card, retailing, utilities, airlines, insurance and government applications. As with data entry, OCR usage is increasing annually, with banking applications growing the fastest.

These readers provide users with the best reading performance of both technologies, with exceptionally low rejection rates and high accuracy. They can read, endorse, cancel and number each document before it’s microfilmed and sorted into stacker pockets. The auxiliary functions of endorsing, canceling, barcoding and microfilming don’t degrade the reading or throughput performance of these systems.

Data capture OCR and forms processing were mature technologies in 1984. They have increased the productivity of information processing and business automation applications in the United States and around the world.

The 1980s business environment was referred to as the “age of instant response.” During the next decade (1991-2000), many administrative employees began working at home and went online when they needed to communicate with their offices as part of the integrated digital environment.

By 1995, sales of digital imaging systems and services in the US exceeded $8 billion. Most businesses were equipped with computers connected to communication networks. Online workstations and related equipment supplemented OCR, because OCR’s greatest strengths are its ability to enter data into a database quickly and its demonstrated reading accuracy.

Virtually every manufacturer of data processing systems and office equipment plans to be an aggressive contributor that meets the needs of the business environment…where highly skilled, highly paid, knowledge workers use their time and talents in the most efficient way possible. Data recognition technologies, combined with reliable paper handling, systems integration and powerful user-friendly software, have reduced costs and increased productivity.

Text Recognition

One of the most important and cost effective EDMS tools is OCR. Initially, digital image scanners could handle either text or images. As microprocessors became cheaper, faster and more sophisticated, suppliers and integrators began offering software solutions for text recognition OCR applications. Now virtually all EDMS systems use OCR text recognition to convert scanned image data into digital text files for document management, word processing, desktop publishing applications and automatic indexing (fuzzy scanning).

OCR “text-only” scanners read text faster than image scanners with auxiliary OCR software. Dedicated text scanners handle a wider variety of typefaces, line spacings and manual editing marks. OCR software recognizes patterns of dots (bits) from electronic bitmaps as complete characters and converts each character into ASCII.
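The idea of recognizing a pattern of dots as a complete character and emitting its ASCII code can be sketched in a few lines of Python. This is a toy template matcher under stated assumptions, not a description of any commercial reader: the 3x5 bitmaps, the three-letter alphabet, the `max_mismatches` threshold and the function names are all hypothetical.

```python
# Hypothetical 3x5 reference bitmaps ("1" = dark dot, "0" = blank).
TEMPLATES = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}


def mismatches(a, b):
    """Count the dot positions at which two equal-sized bitmaps differ."""
    return sum(pa != pb for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def recognize(bitmap, max_mismatches=2):
    """Return the ASCII code of the closest template, or None to reject.

    A character is rejected (left for manual entry) when even the best
    template differs from the scanned bitmap in too many dots.
    """
    best_char = min(TEMPLATES, key=lambda ch: mismatches(bitmap, TEMPLATES[ch]))
    if mismatches(bitmap, TEMPLATES[best_char]) > max_mismatches:
        return None
    return ord(best_char)


if __name__ == "__main__":
    # An "L" with one extra blank dot in the bottom row (scanner noise).
    noisy_l = ("100", "100", "100", "100", "110")
    print(recognize(noisy_l))          # ASCII code for "L"
    print(recognize(("111",) * 5))     # solid block: rejected
```

The reject path matters as much as the match path: a character that resembles no template closely enough is better flagged for an operator than guessed.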

Artificial Intelligence

Text scanning recognition systems can use artificial intelligence and neural networks to “learn” new typefaces and their ASCII equivalents. Other OCR recognition systems use built-in knowledge about typefaces and character shapes. Some OCR systems use dictionary matrixes to enhance character recognition. However, the dictionary matrixes can be biased for the typefaces scanned.

This class of scanners is used with EDMS systems for desktop publishing applications. They also include software for spell checking and flagging suspicious characters or words for offline correction. However, desktop publishing systems with Omnifont recognition software usually lack the reliability and precision required for data entry or automatic indexing applications where the data is non-contextual or numeric without check digits.
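Check digits matter for numeric fields because a misread digit cannot be caught by a dictionary or by context. The article does not name a particular scheme; the Python sketch below uses the widely known Luhn algorithm purely as an example, and the function name `luhn_valid` and the sample number are illustrative assumptions.

```python
def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn check.

    The last digit is treated as the check digit. Any single misread
    digit changes the checksum, so the field is flagged as invalid.
    """
    if not digits.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_valid("79927398713"))  # well-known valid test number
    print(luhn_valid("79927398718"))  # last digit misread as 8
```

A numeric field without such a check gives the system no way to tell a correct read from a plausible but wrong one, which is the reliability gap described above.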

Desktop Publishing

OCR is often used where throughput is important and text needs to be read, recognized and output in ASCII. Most OCR readers identify characters in an electronic (LSI) recognition processor unit (RU or an FRU), which can be separate from the scanner. They then transmit text files in ASCII to the host computer for processing. Some generate word processing files with format information such as margins.

Database Recognition

A desktop text scanner with OCR text software is an effective way to convert typed or printed text material into a digital format for word processing or electronic databases. Creating a database application is time consuming and frustrating due to the amount of data entry needed to generate a useful database. When database information exists in printed form, an EDMS system can shortcut the data capture task by scanning the information directly into the computer in ASCII. Publishers of graphics packages are now standardizing file formats for the movement of data among publishing packages and various digital input/output devices.

All EDMS systems need an integrated processor to process images. This can be inside the scanner, or in an interface that plugs into the computer. The basic function of the system is to digitize, index, store and retrieve data.

Increased usage of “chips that see” has resulted in performance increases in addition to price reductions from more than $300 per chip to less than $10. Further advances in miniaturization have enabled recognology devices to become smaller, faster and more accurate.

Reliable and versatile OCR readers are now available for less than $500 per station. In the future, OCR systems will automatically “learn” how to identify and read complex fonts such as Kanji, Hebrew and Arabic.

While some systems now offer accurate recognition of real world handprinting, in the next few years, solutions will recognize unconstrained, unsegmented, handprinted characters in parallel with the accurate, cost-effective recognition of cursive handwriting. In the next decade, data capture OCR, text recognition OCR and cursive OCR will all be resident in the same data capture/forms recognition/document management system.

The boundaries between these applications and OCR technologies are starting to blur as the global industry changes culture and fully adapts to the integrated digital environment.

Herb Schantz (hlsassoc@aol.com, 703-444-7037) CDM, PE, CDP is president of HLS Associates International.

Copyright Association for Work Process Improvement Apr 2000
