Don’t call it “OCR” anymore. Intelligent Document Processing (IDP), or what analyst firm Deep Analysis calls “cognitive capture,” goes well beyond the traditional approach of applying brute-force OCR to documents in order to create searchable content. In fact, OCR is increasingly not needed at all, as more and more documents are born digital.
All of this raises the question: if IDP software is not the same as the OCR-focused document capture of the past and it promises greater performance, are we missing something? Are organizations even ready to benefit from it? To answer those questions, we need to focus on the two main differences between document capture and IDP software:
- Increased use of machine learning
- Ability to make sense of documents beyond forms
Let’s take these in reverse order.
The Evolution of Document Capture: Expanded Document Support
Looking back on the evolution of document capture, it all started with the simple process of turning scanned documents into searchable data. To do this, software converted scanned text into computer text via OCR, operating at the page level and converting every page. Index data could also be added manually, all in service of making documents easier to find. Then companies figured out that they could use this same software to automate business processes that used forms. Instead of applying OCR to every word on each page, a “template” could be created that told the software where to apply OCR. Form data like names, addresses, and other key values would be located and extracted into a business process. Data from insurance claims forms could go directly into a workflow. This worked very well given the high degree of standardization: you only needed to create a few templates to cover the various form layouts. Easy peasy. In fact, it worked so well that enterprising organizations started looking around at other document-intensive processes that went beyond forms.
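To make the template idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular capture product works: the page is represented as already-recognized lines of text standing in for a scanned image, and the `Zone` class, `CLAIM_FORM_TEMPLATE`, and `extract_with_template` are hypothetical names. A real system would crop an image region and run OCR on just that region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A fixed region of the page where one field is expected (hypothetical)."""
    field: str
    row: int        # line index on the page
    col_start: int  # first character column of the zone
    col_end: int    # one past the last character column


# Hypothetical template for a single claim form layout.
CLAIM_FORM_TEMPLATE = [
    Zone("name", row=1, col_start=6, col_end=30),
    Zone("policy_number", row=2, col_start=10, col_end=30),
]


def extract_with_template(page_lines, template):
    """Read each field from its fixed zone; nothing else on the page is examined."""
    result = {}
    for zone in template:
        line = page_lines[zone.row] if zone.row < len(page_lines) else ""
        result[zone.field] = line[zone.col_start:zone.col_end].strip()
    return result


# A page that matches the layout the template was built for.
standard_page = [
    "INSURANCE CLAIM FORM",
    "Name: Jane Doe",
    "Policy #: AB-12345",
]

# The same content shifted down by one line, as a different layout might do.
shifted_page = [
    "INSURANCE CLAIM FORM",
    "",
    "Name: Jane Doe",
    "Policy #: AB-12345",
]

standard_result = extract_with_template(standard_page, CLAIM_FORM_TEMPLATE)
shifted_result = extract_with_template(shifted_page, CLAIM_FORM_TEMPLATE)
print(standard_result)
print(shifted_result)
```

On the standard page the zones line up with the data. On the shifted page the same zones read the wrong lines, which is why each distinct layout needs its own template.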
But then it got very difficult. You see, unlike a standard form, other document-based information isn’t nearly as uniform. In these documents, commonly referred to as “semi-structured documents,” the needed data might be in a hundred different places and in several different formats, meaning that a hundred or more templates had to be created. Still, for processes with a high number of documents, spending several hundred hours creating and optimizing templates for each variant was worth it. But the progression stopped there: other business processes that could benefit didn’t have enough document volume to justify the expense of configuration.
To overcome the rigidity of templates, software vendors got creative, introducing more tolerant, flexible rules-based approaches. Instead of drawing zones around fields to create templates, fields were located by the labels or keywords themselves. A remittance date value could be found by using the label “Date:”. A purchase order number could be found by using “PO #”. Rules could also solve the challenge of identifying documents automatically. Instead of manually sorting documents and using separator pages or barcodes, keywords could be used to distinguish a remittance from a fax cover page and so on.
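The label-and-keyword approach described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not any vendor's actual rule engine: the rule tables `FIELD_RULES` and `DOC_TYPE_KEYWORDS`, the function names, the date format, and the sample text are all hypothetical, and the input is assumed to be text that has already been recognized.

```python
import re

# Hypothetical field rules: each field is found by its label, wherever it appears.
FIELD_RULES = {
    "remittance_date": re.compile(r"Date:\s*(\d{1,2}/\d{1,2}/\d{4})"),
    "po_number": re.compile(r"PO\s*#\s*:?\s*([A-Z0-9-]+)"),
}

# Hypothetical classification rules: the first type whose keyword appears wins.
DOC_TYPE_KEYWORDS = {
    "fax_cover": "fax cover",
    "remittance": "remittance advice",
}


def classify(text):
    """Identify the document type by keyword instead of separator pages."""
    lowered = text.lower()
    for doc_type, keyword in DOC_TYPE_KEYWORDS.items():
        if keyword in lowered:
            return doc_type
    return "unknown"


def extract_fields(text):
    """Locate each field by its label; a field with no match is None."""
    fields = {}
    for name, pattern in FIELD_RULES.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


remittance_text = (
    "REMITTANCE ADVICE\n"
    "Date: 03/15/2021\n"
    "PO # 4500-771\n"
    "Amount: 1,250.00"
)
fax_text = "FAX COVER SHEET\nTo: Accounts Payable"

remittance_type = classify(remittance_text)
fax_type = classify(fax_text)
remittance_fields = extract_fields(remittance_text)
print(remittance_type, remittance_fields)
print(fax_type)
```

Because the rules key on labels rather than positions, the same rule finds the date whether it sits at the top or the bottom of the page. The cost is that every label variant and value format has to be anticipated in a pattern.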
While a rules-based approach meant that organizations no longer had to create and manage hundreds of templates and could reduce manual preparation, this technique did require significant analysis of a lot of sample data. And the complexity grew significantly as simple templates were replaced by more complex rules that could include coding regular expressions. To get around the complexity, hiring professional services to do all this work became the norm. Some advances, like the ability to create knowledge bases of rules based upon user feedback, were made in order to make things simpler, but overall, the systems weren’t spared from this additional complexity. Again, adoption within organizations slowed. Organizations were content to apply document capture to the most expensive, least complex processes.
So the current state of document capture to this day is mostly a limited adoption of forms processing and some high-volume processing of semi-structured documents, mostly invoices. These systems sit in maintenance mode, like those old COBOL programs. And if changes are needed, most organizations rely heavily on a cadre of professional services staff to do that work, because becoming experts in these systems is a bridge too far. Something has to change because documents are not going away.
Part 2 | Rise of IDP and Machine Learning
For the better part of a decade, the capabilities of document capture software did not advance much. Then machine learning came to the rescue. Machine learning promised to not only expand the range of documents but also to dramatically open the market to any process with documents. Read part 2 now.