Successful Recognition Strategies: Data Review and Entry is Key to Useful Data
Vendor marketing often touts artificial intelligence that achieves 99.9 percent accuracy for data extracted from forms and documents, but reaching that number still depends on one significant component: humans.
Even computers make mistakes, and to ensure that your data reaches the level of quality you need, people must review the computer’s output. Achieving accuracy rates as high as 99 percent requires people at multiple levels of review, not only to ensure that the computer didn’t make a mistake, but also to ensure that the person reviewing the computer didn’t make one!
This doesn’t mean that these additional levels of quality assurance need to be time-consuming and costly. The best operations optimize both the amount of accurate data extracted automatically through machine learning techniques and the workflows designed to ensure that all data is as accurate as possible. In fact, modern automated data entry workflows can eliminate up to 90 percent of existing forms processing keystrokes.
Here is how it is possible to achieve this on your own.
Superior Data Cleanliness
The first stop on this journey is maximizing the chance that data recognition will carry most of the burden. Well-designed forms result in better data gathering and ultimately better data recognition, and they help with manual review because fields are larger and easier to locate.
The second area deals with batching processes: more manageable batches reduce the time and effort needed when problems arise.
The third deals with the actual keying workflows and is where a lot of efficiency can be gained.
Three Basic Rules
Incorporating three rules into the process provides consistency, raises the probability of increased accuracy, and creates a foundation for making the best use of resources and saving money. Here’s how it works.
Rule 1: Optimize Paper Form Design
Superior data capture begins with optimizing the design of the paper form. A few design elements can make a world of difference. Be sure to clearly communicate instructions to end users through obvious data labels and field identifiers. Leave ample space for users to fill out their information completely so they can write legibly. Note that email addresses are the most difficult fields to capture accurately, so make it easy for the user: leave plenty of room for the address, pre-print the “@” sign, and leave plenty of space for the domain name and extension. It’s best to use single-sided forms printed on quality paper so the document holds up through mail delivery and imaging. If multi-part forms are necessary, try to keep pertinent pages together in one form package. Finally, big, clear field boxes help guide users in filling out the form, and the drop-out ink methods available in more sophisticated forms processing tools ensure that the guidelines will not interfere with capturing data.
Where it is not possible to influence form design, there are still some tricks to ensure that both the recognition and data validation processes are as efficient as possible. For instance, achieving maximum contrast between the image background and the form data (both the form itself and the completed fields) helps with any image cleanup done prior to form recognition. Another technique is to set up field-level template removal so that only the data remains. Performing this at the field level is essential when some images may have skew or scaling problems.
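As a rough illustration of the contrast point above, the sketch below applies a simple min-max contrast stretch to a grayscale image held as rows of 0-255 pixel values. This is only one basic cleanup technique and is not necessarily what any given forms processing tool does. The function name stretch_contrast and the list-of-rows image representation are assumptions made for the example.

```python
from typing import List


def stretch_contrast(pixels: List[List[int]]) -> List[List[int]]:
    """Rescale grayscale values so the darkest pixel maps to 0 and the lightest to 255.

    Hypothetical helper: assumes an 8-bit grayscale image stored as rows of ints.
    """
    flat = [p for row in pixels for p in row]
    if not flat:
        return []
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in pixels]
    span = hi - lo
    # Integer arithmetic keeps the result deterministic and within 0..255.
    return [[(p - lo) * 255 // span for p in row] for row in pixels]


# A washed-out scan whose values only span 100..200.
faded = [[100, 150], [125, 200]]
print(stretch_contrast(faded))  # [[0, 127], [63, 255]]
```

After stretching, the background and the handwriting sit further apart in value, which tends to make later thresholding and cleanup steps easier.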
Rule 2: Discipline in Batching and Imaging Techniques
Once the forms have been properly designed, it is critical to instill discipline into the batching and imaging process. Workshops to train your imagers help ensure accuracy in data handling, and this diligence will pay dividends later in the resulting data integrity. There are two key factors to consider in batching and imaging. First, image forms in batches of a manageable size. This isolates problems such as lost forms or unreadable fields. Second, make sure that naming conventions reflect as much information as possible, including data content, project ID, and date and time stamps. Taking the time to build these steps into the batching and imaging process greatly facilitates status tracking and significantly aids troubleshooting.
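The two batching factors can be sketched in a few lines. The example below is a minimal illustration, not a prescribed scheme: the batch size of 50, the name layout, and the helper names chunk_forms and batch_name are all assumptions chosen for the example.

```python
from datetime import datetime
from typing import Iterator, List, Sequence


def chunk_forms(form_ids: Sequence[str], batch_size: int = 50) -> Iterator[List[str]]:
    """Split a run of scanned forms into batches of a manageable size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(form_ids), batch_size):
        yield list(form_ids[start:start + batch_size])


def batch_name(content: str, project_id: str, sequence: int, scanned_at: datetime) -> str:
    """Build a batch name carrying project ID, data content, date and time, and a sequence number."""
    stamp = scanned_at.strftime("%Y%m%dT%H%M%S")
    return f"{project_id}_{content}_{stamp}_{sequence:04d}"


# Hypothetical run of 120 forms, imaged in batches of 50.
forms = [f"form-{n:03d}" for n in range(1, 121)]
scanned_at = datetime(2024, 3, 15, 9, 30, 0)
for seq, batch in enumerate(chunk_forms(forms, 50), start=1):
    print(batch_name("enrollment", "PRJ042", seq, scanned_at), len(batch))
```

With names built this way, a lost form or an unreadable field can be traced to one small batch, and the name alone shows which project, content type, and scan time it belongs to.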
Rule 3: Capture Data Right the First Time
For fields that must be keyed (when a customer demands double-pass keying, for example), efficient data entry presentation goes a long way toward quick and accurate data capture. One way to present data most efficiently is to provide keying tools that present “like-type” fields to operators rather than full forms. This data entry feature helps establish rhythm and consistency, increasing accuracy and speed. A significant added bonus of using like-type fields is heightened customer security. Here’s how it works: a given operator keys only first names while another enters only Social Security numbers. The two fields are never shown together, which maintains client privacy.
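One way to picture like-type keying is as a regrouping step: field images are pulled out of their forms and queued by field type, so each operator works through a single type. The sketch below assumes field snippets have already been cropped from the form images. FieldSnippet, build_like_type_queues, and the file names are hypothetical and used only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class FieldSnippet(NamedTuple):
    """A cropped image of one field from one form (hypothetical structure)."""
    form_id: str
    field_type: str
    image_ref: str


def build_like_type_queues(snippets: List[FieldSnippet]) -> Dict[str, List[FieldSnippet]]:
    """Group snippets by field type so each queue holds only one kind of field."""
    queues: Dict[str, List[FieldSnippet]] = defaultdict(list)
    for snippet in snippets:
        queues[snippet.field_type].append(snippet)
    return dict(queues)


snippets = [
    FieldSnippet("F001", "first_name", "F001_first_name.png"),
    FieldSnippet("F001", "ssn", "F001_ssn.png"),
    FieldSnippet("F002", "first_name", "F002_first_name.png"),
    FieldSnippet("F002", "ssn", "F002_ssn.png"),
]
queues = build_like_type_queues(snippets)
# One operator is handed queues["first_name"], a different operator queues["ssn"].
print({field_type: len(items) for field_type, items in queues.items()})
```

Because each queue contains a single field type, no operator's work queue pairs a name with its number. The keyed values are rejoined by form_id only after entry.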
In addition to technology, instilling efficient multi-pass capture techniques helps ensure that data starts out clean. As a rule of thumb, remember that one-pass keying typically yields 95 percent accuracy in reads while double-pass keying increases the accuracy to 99.5 percent.
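Double-pass keying generally works by having two operators key the same field independently and accepting a value only when both entries agree. The sketch below shows that comparison step under simple assumptions: values are compared after trimming whitespace, and any mismatch is routed to further review. The function name reconcile_passes and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def reconcile_passes(
    first: Dict[str, str], second: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Compare two independent keying passes of the same form.

    Returns the fields both operators agree on, plus the names of
    fields that disagree and need another look.
    """
    if first.keys() != second.keys():
        raise ValueError("both passes must cover the same fields")
    accepted: Dict[str, str] = {}
    disputed: List[str] = []
    for field, value in first.items():
        a, b = value.strip(), second[field].strip()
        if a == b:
            accepted[field] = a
        else:
            disputed.append(field)
    return accepted, disputed


pass_one = {"first_name": "Maria", "zip": "30301", "email": "m.lopez@example.com"}
pass_two = {"first_name": "Maria ", "zip": "30310", "email": "m.lopez@example.com"}
accepted, disputed = reconcile_passes(pass_one, pass_two)
print(accepted)  # {'first_name': 'Maria', 'email': 'm.lopez@example.com'}
print(disputed)  # ['zip']
```

Only the disputed fields need a reviewer's attention, so the second pass adds accuracy without requiring a full manual check of every form.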
Following these three rules, making the most of each step, and taking a little extra time to design, batch, and image the forms correctly will pay big dividends in the end product. Integrating modern validation tools for operators increases speed and accuracy. Sophisticated forms processing software that includes OCR technology, intelligent keying schemes, and data output manipulation automates forms processing most effectively. The bottom line: a consistent and logical approach to managing all steps will ultimately increase data integrity and save money.