…And Makes Your Life Easier
It may seem counterintuitive given the relative volume of scanned documents versus documents “born digital”, but advanced image processing for scanned documents is now more important than ever. Why? Because even as businesses increasingly adopt digital transformation technologies, documents that start as paper and are converted into images still account for a significant portion of the necessary information within the “first mile” of customer interaction. And in traditional automation of business and transaction documents such as claims and invoices, a significant number are still received as images from multiple input channels, whether email, fax, or even mobile. Put another way, the point of capture is moving further toward the edge, both to support customers’ changing preferences and to support more complex process automation more efficiently. And this means more images with more variance.
Most of the time, these “multi-channel” inputs are treated as exceptions within current automation workflows, primarily because image quality varies so widely. The variance can be caused by differences in scanner optical quality, differences in fax machine settings, and, with mobile capture, problems with lighting, focus, and contrast. So how do organizations support all of these “edge input channels” at once without increasing their own processing costs? Through better image analytics. Let’s look at two current-day examples.
Mobile-captured Images
Take, for instance, an image of a document captured with a mobile device. Organizations with strong customer relationships, such as banks, can incentivize their customers to use special mobile apps that optimize image quality, but the many other businesses that handle millions of customer interactions cannot do the same. As a result, these organizations receive mobile-captured images of documents that vary greatly in quality. To make matters worse, these images rarely provide accurate metadata about their actual resolution (most report a resolution of 72 DPI). Examples abound, including mortgage-related information requests for proof of income, identification, bank statements, and so on. All of these require manual processing.
Data-dense Documents
Another example is the tens of millions of medical claims that arrive already scanned as black-and-white images, captured with a variety of imaging devices. These forms are so “data dense” and vary so widely in quality that they are almost always handled manually; automation is simply not possible without the ability to adapt to the images.
What is the Answer?
This is where advanced image analytics comes in. For mobile-captured pictures, it starts with deriving the correct resolution, which allows follow-on processing to adjust to the real size of the image. If systems do not know the real resolution, images will not be processed using the correct dimensions, and classification or data extraction will fail. Once the resolution is determined, images can be normalized to a specific size so that all documents can be handled with the same automated classification and extraction rules. The process continues by converting the image to black and white, which provides the highest level of contrast between the data and everything else on the image, commonly referred to as “noise.” This noise can take the form of watermarks, shadows, and backgrounds.
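The three steps described above (derive the resolution, normalize the size, convert to black and white) can be sketched roughly as follows. This is an illustration only, not the method used by any particular product. It assumes the document is a US Letter page (8.5 inches wide) that fills the width of the photo, assumes a normalization target of 300 DPI, and swaps in Otsu's method as one common way to choose a black/white threshold. All function names are hypothetical.

```python
# Sketch only: page size, target DPI, and all function names are assumptions.

TARGET_DPI = 300  # assumed normalization target


def estimate_dpi(pixel_width, physical_width_in=8.5):
    """Estimate effective DPI, assuming the page spans the full image width."""
    return pixel_width / physical_width_in


def normalized_size(pixel_width, pixel_height, dpi, target_dpi=TARGET_DPI):
    """Return the pixel dimensions the image would have at target_dpi."""
    scale = target_dpi / dpi
    return round(pixel_width * scale), round(pixel_height * scale)


def otsu_threshold(gray_pixels):
    """Pick the 0-255 threshold that maximizes between-class variance."""
    hist = [0] * 256
    for p in gray_pixels:
        hist[p] += 1
    total = len(gray_pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no pixels at or below t yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the dark class
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray_rows):
    """Convert rows of 0-255 grayscale values to 1 (ink) / 0 (background)."""
    flat = [p for row in gray_rows for p in row]
    t = otsu_threshold(flat)
    return [[1 if p <= t else 0 for p in row] for row in gray_rows]


# A 3024 x 4032 phone photo of a Letter page, whatever its metadata claims:
dpi = estimate_dpi(3024)
width, height = normalized_size(3024, 4032, dpi)
```

Note that the estimate deliberately ignores the 72 DPI value a file might report and works from the pixel dimensions instead. A real system would also need to find the page edges and correct perspective before measuring, and a single global threshold tends to struggle with shadows, where locally adaptive thresholds usually do better.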
Once there is a “clean” image, data can be located and extracted with almost as much precision as if the image had been captured with a high-quality scanner. For forms processing, a completely new level of noise removal becomes possible: what we call “virtual drop-out.” This capability can locate and remove the underlying form’s pre-printed text and structure (boxes, combs, etc.) at a field level, producing an image that resembles the more expensive drop-out ink variety of forms.
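To make the idea of removing pre-printed form structure concrete, here is a deliberately simplified sketch. It is not the “virtual drop-out” capability described above, whose internals are not covered here. It assumes, hypothetically, that a binarized image of the blank form is available as a template and is already perfectly aligned with the filled-in image, and it works on the whole image rather than field by field. The function name is illustrative.

```python
def subtract_template(filled, template):
    """Remove pre-printed form pixels from a binarized, aligned form image.

    Both arguments are equal-sized grids of 1 (ink) / 0 (background).
    Assumption: `template` is the blank form, already registered to `filled`.
    """
    same_shape = len(filled) == len(template) and all(
        len(f_row) == len(t_row) for f_row, t_row in zip(filled, template)
    )
    if not same_shape:
        raise ValueError("filled image and template must have the same size")
    # Keep a pixel only if it is ink on the filled form but not on the template.
    return [
        [1 if f == 1 and t == 0 else 0 for f, t in zip(f_row, t_row)]
        for f_row, t_row in zip(filled, template)
    ]


# A 5 x 5 field box (the template) with one handwritten mark inside it.
box = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
filled = [row[:] for row in box]
filled[2][2] = 1  # the entered data
data_only = subtract_template(filled, box)
```

In this toy case only the center mark survives. In practice the hard parts are exactly what this sketch assumes away: registering the template to a skewed or scaled image, and preserving entered data that overlaps the printed lines.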
All of this can occur without the need for a special mobile app or special workflows.
The result is the ability to process documents that arrive from any input source, at any level of image quality, in a single stream without increasing complexity or cost.