Handwriting Recognition | Knowledge Base | Definition
The Challenges
Transcribing handwritten information quickly into something that a computer can read has long been a challenge. I’ll take you through a bit of the history of handwriting recognition and juxtapose that with where we are now. Recent developments in both the technology and its performance are exciting.
You can train computer vision software to transcribe a machine-print font into machine-readable information fairly easily. Even with thousands of fonts out there, training software for this is very doable, but let’s take handwriting and think of it as a font. There is a unique font for every person on the planet who writes information. So I have a different font from my brother and my wife and my friends; each of us has our own. And then, complexity is added by how that information is conveyed on documents (e.g., in boxes on a form).
Ideally, the underlying structure of the form drops out, leaving only information. But then, we have structures that can’t be removed from documents. To add to the complexity, freeform handwriting as well as cursive are often part of the mix. All these challenges make it an order of magnitude more difficult to transcribe handwritten information than machine-print fonts into machine-readable text.
Handwriting Recognition: Technological Progress
How do we overcome these challenges? Until the last couple of years, to find and extract specific handwritten information in a document, you needed to add context to it. Basically, context is knowledge about what needs to be extracted. It can be as simple as telling the system whether a field contains only numbers, only letters, or a mix of the two. More complex examples of context are regular expressions or pre-built dictionaries of words and phrases.
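To make the idea of context concrete, here is a minimal sketch, not any particular product's implementation. It assumes a hypothetical recognizer that returns several candidate readings of a field, each with a confidence score, and uses a regular expression as the context that filters them. The function name `pick_with_context` and the sample candidates are made up for illustration.

```python
import re

def pick_with_context(candidates, pattern):
    """Return the highest-scoring candidate whose text fully matches pattern, or None."""
    allowed = [(text, score) for text, score in candidates if re.fullmatch(pattern, text)]
    if not allowed:
        return None
    return max(allowed, key=lambda pair: pair[1])[0]

# Hypothetical recognizer output for one field: (candidate text, confidence).
candidates = [("1O45", 0.41), ("1045", 0.38), ("IO45", 0.21)]

# Context: the field is known to hold exactly four digits.
print(pick_with_context(candidates, r"\d{4}"))  # prints 1045
```

Without context, the top-scoring reading would be "1O45". Telling the system that the field is numeric is enough to discard it in favor of the reading made only of digits.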
Examples of Adding Context
One example is account numbers in a system, where you have a database of the expected account numbers. You can use that, or you can bring to bear a dictionary of names or other common terms. In claims adjudication, we can take a list of known ailments or services, along with their ICD-10 codes, and use that to aid recognition.
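The dictionary approach can be sketched the same way. This example uses Python's standard `difflib` module to snap a raw reading to the closest entry in a list of expected values. The account numbers, the function name `snap_to_dictionary`, and the 0.8 similarity cutoff are assumptions chosen for illustration; a production system would more likely score candidates inside the recognizer itself.

```python
import difflib

def snap_to_dictionary(raw_text, known_values, cutoff=0.8):
    """Return the known value most similar to raw_text, or None if nothing is close enough."""
    matches = difflib.get_close_matches(raw_text, known_values, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Hypothetical list of account numbers expected in the system.
known_accounts = ["ACC-10452", "ACC-20931", "ACC-77120"]

# The raw reading confuses the digit 0 with the letter O.
print(snap_to_dictionary("ACC-1O452", known_accounts))  # prints ACC-10452
```

A reading that resembles nothing in the list comes back as `None`, which is a natural signal to route the field to manual review.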
All these things help define the scope of handwriting recognition, aiding the recognizers in overcoming this problem of a billion fonts: a billion different ways to write information, with a lot of variance. Context information was added at recognition time to help these handwriting recognizers narrow the millions of potential answers down to the most likely ones.
As you can imagine, identifying the context and applying that context requires you to carefully scope a project. In some cases, that’s just not possible. A perfect example is a handwritten vehicle identification number (VIN code or VIN number) where there can be quite a bit of variance. It’s alphanumeric. So we have this common challenge of identifying “O” and distinguishing it from zeroes and “S” from fives, etc.
Handwriting recognition was successfully applied to very specialized cases, such as check recognition or postal automation, where you can define the scope (of address blocks, for example) and have context. A lot of companies tried handwriting recognition in other types of applications, believing that it was as simple to use as OCR. They found that it was not, so they just dropped it.
What’s Changed in Handwriting Recognition
So, what’s changed is the continued application of Moore’s law. Loosely stated, it says that computing power doubles roughly every two years (strictly, the observation is about the number of transistors on a chip). What computing power gives us is the ability to take things that would have been difficult to do because they took a long time and make them a lot easier.
Data Access & Computing Power
Computing power makes it practical to use the large amounts of sample data that machine learning requires, and machine learning is at the core of handwriting recognition. We train computer vision systems to understand what is on a written page.
Handwriting has a lot of variance, so the more examples of different writing styles we have, the better the software will perform. Access to the necessary client data is what makes this training possible. And then, there’s deep learning.
Deep Learning Advancements
Deep learning is a variety of neural network that adds a number of what we call hidden layers between the inputs and the outputs. Each of these hidden layers consists of nodes that try to pick out different aspects of the input. When we apply deep learning to handwriting recognition, we can use a lot more information, and we can process a lot more of the information that is input into the system.
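As a rough illustration of hidden layers sitting between inputs and outputs, here is a toy forward pass in plain Python. It is a sketch only: the layer sizes are arbitrary, the weights are random rather than trained, and the 16 input values stand in for pixels of a character image. Real handwriting recognizers use much larger trained networks built with dedicated libraries.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer; weights holds one row of input weights per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]

def relu(values):
    """Hidden-layer activation: negative values become zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def random_layer(n_in, n_out, rng):
    """Untrained layer with small random weights and zero biases (illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(pixels, layers):
    """Pass the input through every hidden layer, then the output layer."""
    activations = pixels
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))

rng = random.Random(0)
sizes = [16, 12, 8, 10]  # 16 inputs, two hidden layers, 10 output classes (digits 0-9)
layers = [random_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
pixels = [rng.random() for _ in range(16)]  # stand-in for a tiny character image

probabilities = forward(pixels, layers)
print(len(probabilities))  # one probability per digit class
```

Because the weights are untrained, the probabilities are meaningless as predictions. The point is only the shape of the computation: each hidden layer transforms the previous layer's output before the final layer assigns a score to each class.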
Deep learning enables more intensive computation to identify what is right and what is wrong in an amazing amount of detail, much more than with traditional neural networks. All of this is possible because we have more data and more computing power.
Handwriting Recognition Today
The end result is significantly improved performance on handwriting recognition. We can extract handwritten information without having to do much to it, which was impossible just a few years ago. Now, with applied deep learning, access to a broad range of information, and much greater amounts of computing power, handwriting recognition has advanced considerably.
This opens up handwriting recognition to business problems that might previously have been off limits, either because it was too difficult to find the right context or because it was just too expensive and time consuming to set up and configure.
Achieving High Quality Handwriting Recognition Results
Handwriting recognition just got a lot better. You do not need to spend a lot of time adding context or optimizing it. The baseline performance is truly incredible now, and anytime you can specify to the software what you are really looking for, the performance is going to improve even further. Whether it is parsing handwritten correspondence in prior authorization or claims adjudication, or within any type of core business process, Parascript can do that.