We sat down with Tatyana Vazulina, the Product Manager at Parascript, to find out about advanced capture and the critical role that context plays in successful and accurate document recognition. What follows is our edited interview that captures Tatyana’s latest thoughts on advanced capture.
Why is context so important in document recognition and capture?
Tatyana: Successful recognition really begins with context. By “context,” I mean information or clues about the fields in a document that refine recognition and constrain the possible results so they are more accurate. Say we have a document with poor-quality print or handwritten fields, and one field contains 3032489993. When we provide the context that this is a phone number and therefore numeric, the field can be recognized accurately: it is clear that the 999 is three nines and not three instances of the letter “g”. Without context, it’s difficult to determine what those characters represent. With context, they are immediately recognizable.
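The phone number example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Parascript's implementation: the `decode` function, the candidate scores, and the idea of a per-character score table are all invented for the example. It shows how restricting the allowed alphabet to digits turns an ambiguous "g or 9" read into nines.

```python
import string


def decode(candidates, allowed=None):
    """Pick the best-scoring symbol at each position.

    `candidates` is a list of {symbol: score} dicts, one per character
    position (a hypothetical recognizer output). `allowed` is the optional
    context: the set of symbols the field may contain.
    """
    result = []
    for scores in candidates:
        # Keep only the symbols permitted by the context, if any was given.
        options = {s: p for s, p in scores.items() if allowed is None or s in allowed}
        if not options:
            result.append("?")  # nothing permitted at this position
            continue
        result.append(max(options, key=options.get))
    return "".join(result)


# Invented scores: the shape looks slightly more like "g" than "9".
ambiguous = [{"g": 0.55, "9": 0.45}] * 3

no_context = decode(ambiguous)
with_context = decode(ambiguous, allowed=set(string.digits))
```

Without context the sketch settles on the letter, because that is the higher raw score. With the numeric constraint the letter is never considered, so the digits win.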
Does context matter in handwriting recognition?
Tatyana: We’ve seen context play a significant role in handwriting recognition. When people decipher hard-to-read handwriting, they tend to ignore poorly written individual characters. Instead, they concentrate on entire words and look at the document as a whole to identify information that helps narrow the possible options and determine what’s actually been written. That’s what our software does as well. Parascript’s exceptional recognition capabilities are based on the human perception of context.
When does context matter in handwriting recognition?
Tatyana: Almost always. Even the simplest form fields, such as “Date of Birth,” hold valuable context information. The date may be expressed in alpha or numeric form. The character style may be constrained or unconstrained hand-print or cursive handwriting, depending on the type of document. The field may also be in a European, American or another date format. Often, we can provide this type of context as a built-in feature of a specific field type. We do this for amounts and dates written on checks.
The more precise the context provided, the narrower the range of possible answers, and this increases recognition read rates. Overly broad context increases the probability that an irrelevant word is chosen as the answer. And yet, when context is too restricted, the correct word may be excluded. So it’s all a balancing act.
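The balancing act can be illustrated with a small vocabulary lookup. This is a rough sketch that uses Python's standard `difflib` fuzzy matching as a stand-in for a real recognizer; the city field, the misread "Denvar", the word lists, and the `pick_from_vocabulary` helper are assumptions made up for the example.

```python
from difflib import get_close_matches


def pick_from_vocabulary(raw_read, vocabulary, cutoff=0.6):
    """Return the vocabulary word closest to the raw read, or None."""
    matches = get_close_matches(raw_read, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


raw_read = "Denvar"  # hypothetical noisy read of a city field

too_narrow = ["Boulder", "Aurora"]          # the correct word is missing
balanced = ["Denver", "Boulder", "Aurora"]  # only plausible cities
too_broad = balanced + ["Denar"]            # an irrelevant word slips in

narrow_answer = pick_from_vocabulary(raw_read, too_narrow)
balanced_answer = pick_from_vocabulary(raw_read, balanced)
broad_answer = pick_from_vocabulary(raw_read, too_broad)
```

With the narrow list no answer is found, with the balanced list the intended city is chosen, and with the broad list the irrelevant word happens to sit closer to the misread string and is picked instead.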
So where do we start in applying context to our document recognition and capture?
Tatyana: To achieve a high recognition rate, each field in a document must be defined by describing its content. The context description starts with choosing an appropriate field type. After the field type is specified, its name determines what additional information is available to adjust the recognizer for processing a stream of field images. For example, choosing the Field Type Date Numeric specifies the set of characters allowed in the field and the properties available for context adjustment: their names, default values and a list of possible values for each property.
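One way to picture a field type is as a small record holding the allowed characters plus a set of named properties, each with a default and a list of possible values. The sketch below is a hypothetical illustration only: the `FIELD_TYPES` registry, the `date_format` property, and its "American" and "European" values are assumed names, not Parascript's actual configuration.

```python
from datetime import datetime

# Hypothetical registry: what choosing "Date Numeric" might make available.
FIELD_TYPES = {
    "Date Numeric": {
        "allowed_chars": set("0123456789/"),
        "properties": {
            "date_format": {
                "default": "American",
                "choices": ["American", "European"],
            },
        },
    },
}

# How each assumed date_format value orders month and day.
DATE_PATTERNS = {"American": "%m/%d/%Y", "European": "%d/%m/%Y"}


def interpret_date(text, date_format=None):
    """Interpret a numeric date using the field type's context."""
    spec = FIELD_TYPES["Date Numeric"]
    prop = spec["properties"]["date_format"]
    date_format = date_format or prop["default"]
    if date_format not in prop["choices"]:
        raise ValueError(f"unsupported date_format: {date_format}")
    if not set(text) <= spec["allowed_chars"]:
        raise ValueError(f"characters not allowed in a numeric date: {text!r}")
    return datetime.strptime(text, DATE_PATTERNS[date_format]).date()


american = interpret_date("03/04/2021")
european = interpret_date("03/04/2021", date_format="European")
```

The same characters yield March 4 under the default and April 3 under the European setting, which is why the property has to be part of the field's context.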
Parascript provides Field Types that apply to a variety of fields commonly found on forms. Optimizing recognition results requires selecting the Field Type that most closely corresponds to a field’s content. So, you always want to be as specific as possible. For instance, Field Type Numeric is less specific than Field Type Amount for a field that contains dollar amounts like $200. Similarly, a field that contains a person’s first, middle, and last names is best assigned the Field Type Full Name, not a more general Field Type like Alpha.
Selecting and applying context to each field of a document is the first step toward accurate ICR results.
How do we provide the context necessary for the most accurate recognition and capture results?
Tatyana: So, as we’ve been discussing, the software must have context to determine what values should be contained in each field of a document or image. This context is provided through defining four critical areas:
- Field type that indicates the type of data in the field;
- Properties that are the parameters of the field; this can also include describing special formats, such as an account number that always starts with two AN characters followed by 7 digits;
- Character style such as constrained handprint, unconstrained handprint, cursive, machine-print; and
- Vocabulary that contains probable values—including words that may be reasonable for the field.
After the field type is specified, its name determines what additional information is available to adjust the software for processing a stream of field images. When the field type, properties, character style and vocabulary are provided, they influence the accuracy and success of ICR.
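The four areas can be gathered into a single description per field. The sketch below is a minimal illustration under assumptions, not Parascript's API: the `FieldContext` class, its `accepts` method, and the `format` property name are hypothetical, and "AN" in the account number example is assumed to mean alphanumeric.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FieldContext:
    """Hypothetical container for the four kinds of context."""
    field_type: str                                  # type of data in the field
    character_style: str                             # e.g. "cursive", "machine-print"
    properties: dict = field(default_factory=dict)   # parameters such as a format
    vocabulary: list = field(default_factory=list)   # probable values, if any

    def accepts(self, value):
        """Check a candidate answer against the format and vocabulary."""
        pattern = self.properties.get("format")
        if pattern and not re.fullmatch(pattern, value):
            return False
        if self.vocabulary and value not in self.vocabulary:
            return False
        return True


# Assumption: "two AN characters followed by 7 digits" is read as
# two alphanumeric characters, then exactly seven digits.
account_number = FieldContext(
    field_type="Account Number",
    character_style="machine-print",
    properties={"format": r"[A-Za-z0-9]{2}\d{7}"},
)

ok = account_number.accepts("AB1234567")
bad_letter = account_number.accepts("ABC123456")
too_short = account_number.accepts("AB123456")
```

A candidate answer that breaks the declared format can be rejected before it is ever returned, which is one way context narrows the possible results.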
Having the full context ensures that the data extracted from documents or images can be more easily recognized and processed.