Don’t just OCR documents. Interpret them.
Document interpretation is central to cognitive document automation for the digital workforce. Over time, it has become much clearer that treating a document automation project as simply an “OCR problem” falls short, and that OCR alone has real limitations when the ultimate objective is unattended document automation. We’ve even written an ebook (Building Bespoke Document Automation) about it. The focus of this article is “interpretation” in cognitive document automation and what that really means.
When we talk about document automation or advanced capture, we mean the process that takes documents and converts the information held within them into useful information for organizations across many different business processes. The information is converted regardless of whether the documents are images captured by a scanner or mobile device, or “digitally-born” documents such as Word documents, emails or PDFs. Typically, “useful” is shorthand for “structured data”: data that can easily be ingested and processed by other automation and business systems.
For a tax form, document automation is the process of locating specific entries of data and presenting them to another system. For more complex documents such as commercial invoices, it is the process of locating specific data about the transaction and presenting it to an accounting or ERP system. For some of the most complex documents, such as contracts, it is the process of identifying specific terms of the contract and presenting these terms to another system. In each of these cases, the focus is on the ability to reliably and efficiently locate and extract specific, needed data. OCR software cannot do this type of interpretation. OCR tools merely convert image-based documents into machine-readable text.
In cognitive computing, the expectations go well beyond getting to this data in a reliable and efficient way. Typically, the data is also expected to travel straight through the process with no manual intervention whatsoever, which is increasingly referred to as “unattended automation.”
As you might imagine, OCR accounts for only a small fraction of the tasks involved. With increasing frequency, OCR is not needed at all. For born-digital documents, no OCR is required because the information they contain is already machine-readable text. For many document classification tasks, visual analysis alone is enough to get the job done. Again, this is without the use of OCR software.
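The idea that OCR is an optional step rather than the whole job can be sketched as a small routing function. This is only an illustration under stated assumptions: the `Document` type, its fields, and the step names are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass
class Document:
    name: str
    has_text_layer: bool  # True for born-digital files (Word, email, text PDFs)
    task: str             # "classify" or "extract" (hypothetical task labels)


def plan_steps(doc: Document) -> list[str]:
    """Return the processing steps a document needs, in order."""
    steps = []
    if not doc.has_text_layer:
        # Scanned or photographed pages are cleaned up before anything else.
        steps.append("image_cleanup")
    if doc.task == "classify":
        # Assumption: classification can rely on visual analysis alone.
        steps.append("visual_classification")
        return steps
    if not doc.has_text_layer:
        # OCR is only needed when there is no machine-readable text yet.
        steps.append("ocr")
    steps.extend(["locate_data", "analyze_data", "extract"])
    return steps
```

In this sketch, OCR appears only on the path for image-based documents that need data extracted; every other path skips it.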
Unattended Automation for Data Location and Extraction
For data location tasks, the effort involved runs the gamut from employing seemingly easy fixed zones/templates for locating data to complex document structure analysis for identifying headings, paragraphs, sentences and words. Complex document structure analysis enables reliable identification of data such as the list of causes for termination of a contract. Not surprisingly, reliably locating and extracting document-based data has become a key focus of advanced capture vendors, while the core OCR technologies are now considered practically a commodity. Most advanced capture vendors don’t even develop their own OCR!
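At the simple end of that range, a fixed-zone template can be sketched as follows. This is a minimal illustration, assuming the words already carry page coordinates; the `Word` type, the zone format `(x0, y0, x1, y1)`, and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # horizontal position of the word on the page
    y: float  # vertical position of the word on the page


def extract_by_zones(words, zones):
    """Collect, for each field, the words that fall inside its fixed zone."""
    result = {}
    for field, (x0, y0, x1, y1) in zones.items():
        inside = [w for w in words if x0 <= w.x <= x1 and y0 <= w.y <= y1]
        # Read top-to-bottom, then left-to-right.
        inside.sort(key=lambda w: (w.y, w.x))
        result[field] = " ".join(w.text for w in inside)
    return result
```

A template like this only works while the layout stays put: a shifted scan or a redesigned form moves the data out of its zone, which is why the approach is only “seemingly” easy and why structure analysis is needed for more variable documents.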
Back to that expectation of unattended automation: this is the ability to process documents with greater accuracy than manual processes while the majority of the needed information is never reviewed by human staff. How is that achieved for document-based information?
Advanced Capture: Technology Stack
If we examine the “technology stack” involved with advanced capture, it starts with image perfection (again, not OCR), moves to OCR in selected cases (remember it isn’t always about images), and then progresses to the interpretation stage where required data is located within documents. There is another crucial step in this interpretation stage before you get to the extraction/presentation stage: data analysis. This component is responsible for evaluating the located data to determine whether it is reliable. In order to do this, a variety of alternatives are considered along with other information about the data (called “context”) to arrive at a conclusion. The best systems employ several different evaluation techniques in order to create consistent reliability outcomes. There is nothing at all within any OCR process or regular expression that will enable this capability. Only this capability can enable true unattended automation for today’s digital workforce-based processes.
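The data analysis step described above can be sketched as a function that weighs alternative readings of a field against context before deciding whether the value can pass through unreviewed. This is a simplified illustration, not Parascript's method: the `Candidate` type, the `context_check` callback, and the threshold and margin values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    value: str
    score: float  # confidence in [0, 1] from the location/recognition step


def evaluate_field(candidates, context_check, accept_threshold=0.9, margin=0.2):
    """Return (value, decision), where decision is "accept" or "review"."""
    # Context first: drop alternatives that contradict other known data.
    plausible = [c for c in candidates if context_check(c.value)]
    if not plausible:
        return None, "review"
    ranked = sorted(plausible, key=lambda c: c.score, reverse=True)
    best = ranked[0]
    runner_up = ranked[1].score if len(ranked) > 1 else 0.0
    # Accept only if the best reading is strong AND clearly ahead of the next.
    if best.score >= accept_threshold and best.score - runner_up >= margin:
        return best.value, "accept"
    return best.value, "review"


# Hypothetical invoice: the total must equal subtotal plus tax.
subtotal, tax = 100.00, 18.00
total_matches = lambda v: abs(float(v) - (subtotal + tax)) < 0.005
readings = [Candidate("113.00", 0.95), Candidate("118.00", 0.93)]
print(evaluate_field(readings, total_matches))
```

In this example the reading with the highest raw score is discarded because it contradicts the subtotal and tax, and the remaining reading is accepted without review. Without the context check, the two readings are too close to call and the field would be sent to a person.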
Digital Workforce-based Processes
The unfortunate reality is that a large number of organizations have implemented advanced capture, whether home-grown or commercial, that still requires 100% verification of data. That is not the definition of unattended automation.
Parascript has been using these interpretation and analysis processes within document automation for over 25 years, achieving true unattended automation that saves organizations literally billions of dollars per year. We continue to push the automation envelope. Today, we enable true automation with Smart Learning, which takes the data science approach necessary to achieve high levels of automation and puts it into the hands of the organizations themselves. It’s time to put the “advanced” into your document capture processes.
If you found this article interesting, you may find this eBook useful. The Building Bespoke Document Automation eBook examines the types of document variance and the options for automation.