I recently attended an AIIM webinar titled “Capture Anywhere-to-Process: The Need for Auto-Classification.” If you’re a member you can catch the recording at this link.
In it, Seth Maislin of Earley & Associates made a very good case for automated classification as a way to streamline multi-channel, multi-format document capture and get the needed data into a business process. The biggest message is that human-based classification just can’t keep up with the growing mountain of document-based information. Part of the answer involves focusing on the most valuable information and using automation to ease the burden.
It also appears that there are two camps when it comes to auto-classification: those who love the “black-box” function that lets the system figure it out, and those who want to peer into that system, understand the underlying rules, and make changes or additions if needed.
The reality is that the best implementations will marry the two, using the black box to accelerate a project and using rules to refine the core classification results interactively, as needed.
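To make the idea of marrying the two approaches concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular product works: `black_box_classify` is a hypothetical stand-in for a trained model, and the rule, the labels, and the 0.8 confidence threshold are all invented for the example. The black box answers first, and human-readable rules step in only when its confidence is low.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Prediction:
    label: str
    confidence: float


def black_box_classify(text: str) -> Prediction:
    """Hypothetical stand-in for a trained model (assumption, not a real API)."""
    lowered = text.lower()
    if "invoice" in lowered:
        return Prediction("invoice", 0.9)
    if "emergency" in lowered:
        return Prediction("health_form", 0.55)
    return Prediction("unknown", 0.2)


# A rule inspects the text and returns a label, or None if it does not apply.
Rule = Callable[[str], Optional[str]]


def emergency_contact_rule(text: str) -> Optional[str]:
    """Illustrative rule that a reviewer could read, edit, or extend."""
    if "emergency contact" in text.lower():
        return "health_form"
    return None


def classify(text: str, rules: List[Rule], threshold: float = 0.8) -> Prediction:
    """Trust the black box when it is confident; otherwise consult the rules."""
    prediction = black_box_classify(text)
    if prediction.confidence >= threshold:
        return prediction
    for rule in rules:
        label = rule(text)
        if label is not None:
            # A matching rule is treated as authoritative in this sketch.
            return Prediction(label, 1.0)
    # No rule matched, so fall back to the low-confidence model output.
    return prediction


if __name__ == "__main__":
    rules = [emergency_contact_rule]
    print(classify("Invoice #12, total due", rules))
    print(classify("Emergency contact: Jane Doe", rules))
```

The design choice here is that rules only refine the results the model is unsure about, so the black box still does most of the work while the rules stay visible and editable.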
What I loved the most is the sample that Seth used: a hand-printed form used for health and emergency information at his daughter’s school.
How many applications have you completed by hand? I know that every year, even though alternatives such as e-forms are out there, I have had to complete no fewer than 20 forms, mostly for government or local services. Yet these forms largely sit in “dark data” land, because the handwritten data on them is so hard to access.
At Parascript, we are currently working on this classification challenge for handwritten information. Beyond that, we’re also tackling the broader challenge of merging the black-box method with rules-oriented techniques to create the best of both worlds without the traditional trade-offs. Stay tuned for more.