According to an article I recently read, the degree of data extraction (i.e., document recognition) automation isn’t the most important aspect of a document capture project if you are neglecting other benefits, such as improved process control and efficiency. This is a good point, and one that is made clear every time I visit a client’s document preparation and data entry operations.
Beyond Unrealistic Expectations
Many companies could certainly use technology to help shepherd the various tasks within any document processing activity through business process management, workload balancing, and productivity reporting. None of these requires sophisticated data extraction technology or the significant effort needed to fine-tune extraction results. Companies can enjoy efficiencies gained through improved throughput and better knowledge of where process bottlenecks occur, so that they can quickly address them. And yet, arguing that a company can benefit simply through improved business processes is something of a cop-out in the face of rising, and often unrealistic, customer expectations that document processing technologies such as recognition should always successfully extract 90+ percent of the necessary information with 99.9 percent accuracy.
We are now seeing the same unrealistic expectations shift to automated classification of documents. Vendors are more than happy to throw new technologies at the problem in the hope that something will improve, but the reality is that no technology available for document classification or data extraction will provide such high levels of success. Not even sophisticated deep learning software like IBM’s Watson achieves these results.
What’s Important: Accuracy or Efficiency?
The answer is that both are important. Efficiency is often a product of accuracy more than of automated business workflows. Then there is the efficiency that can be gained by automating manual, document-oriented processes. Take, for instance, an organization that has a staff of ten to manage the sorting and separation of incoming business documents. Employing scanning and a document preparation workflow certainly reduces unnecessary slowdowns and stoppages. However, true efficiency gains of 50 percent or more are only possible by automating some of the actual activities that these staff members perform.
With document auto-classification, the work of assigning and sorting documents can often be reduced by 50 percent or more with some basic tuning. With this reduction in workload, existing staff can be assigned to handle the exceptions and quality control that are critical to reducing errors but are so often neglected for lack of additional resources.
Notice, however, that I did not argue for reducing workload by 90 percent or more; 50 percent is often enough to gain substantial improvements. The key is that the accuracy of the auto-classification engine’s output must be extremely high: close to or better than what a person can provide. That accuracy only needs to cover 50 percent of the incoming documents to achieve significant automation and cost savings.
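One way to picture the trade-off between coverage and accuracy is a simple routing rule: accept the classifier’s answer only when it is confident, and send everything else to staff. The sketch below is illustrative only. It assumes a hypothetical classifier that returns a confidence score with each prediction, and it uses a made-up, hand-labeled sample; the `Prediction` type, the `route` and `summarize` helpers, and the 0.90 threshold are all assumptions, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    doc_id: str
    predicted: str     # class proposed by the (hypothetical) classifier
    confidence: float  # score in [0, 1]; assumed to be available
    actual: str        # ground truth from a hand-labeled sample


def route(predictions, threshold):
    """Split predictions into auto-accepted and manual-review lists."""
    auto = [p for p in predictions if p.confidence >= threshold]
    manual = [p for p in predictions if p.confidence < threshold]
    return auto, manual


def summarize(predictions, threshold):
    """Report the share of documents automated and the accuracy on that share."""
    auto, manual = route(predictions, threshold)
    coverage = len(auto) / len(predictions) if predictions else 0.0
    correct = sum(1 for p in auto if p.predicted == p.actual)
    accuracy = correct / len(auto) if auto else 0.0
    return {
        "coverage": coverage,
        "auto_accuracy": accuracy,
        "manual_count": len(manual),
    }


# Made-up sample data, for illustration only.
SAMPLE = [
    Prediction("d1", "invoice", 0.98, "invoice"),
    Prediction("d2", "invoice", 0.95, "invoice"),
    Prediction("d3", "claim", 0.91, "claim"),
    Prediction("d4", "claim", 0.62, "invoice"),
    Prediction("d5", "letter", 0.55, "letter"),
    Prediction("d6", "letter", 0.40, "claim"),
]

if __name__ == "__main__":
    print(summarize(SAMPLE, threshold=0.90))
```

On this toy sample, a 0.90 threshold automates half of the documents without a single misclassification, while lowering the threshold raises coverage at the cost of errors in the automated portion. Tuning, in this picture, amounts to choosing the threshold at which the automated share stays as accurate as a person would be.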
Start Small: Yield Big Savings
The accuracy of document classification and data extraction/recognition is extremely important, but project success only requires applying that accuracy to a portion of your document-based information. While increasing the percentage of documents that are automated should always be an objective, starting with a lower percentage will still yield the kinds of savings, efficiencies, and improvements that most organizations are clamoring for today.