Increasingly, organizations looking to acquire automation solutions are employing more sophisticated evaluation processes, including what is typically called a “proof of concept,” or PoC. The rationale for using a PoC as an additional method of evaluation is that technology solutions are often too complex to assess simply by comparing a list of features, the product documentation, or a vendor’s response to a checklist.
Technology Solutions and Their Features
Rarely are technology solutions purchased for their features alone. Viewed through the lens of Clayton Christensen’s “Jobs to be Done” framework, solutions are not acquired because they have a fancy user interface or an expansive API. They are purchased to solve a problem. You don’t buy a lawnmower because of a feature list. You buy it because you want a nice-looking lawn. You may, in the process, want to avoid polluting the environment or exerting yourself too much, so you choose an electric, self-propelled mower. Or you might just hire someone to mow your lawn.
While it is reasonably possible to evaluate options based upon their attributes, automation solutions are extremely difficult to assess in a side-by-side comparison of capability lists. Enter the PoC. The use of PoCs to evaluate Intelligent Document Processing solutions is gaining popularity, and for good reason.
The true value and purpose of Intelligent Document Processing (IDP, or intelligent capture) solutions lies in their ability to comprehensively and accurately convert unstructured, document-based information into structured data. That data is, in turn, used to fulfill another need; IDP solutions are always implemented to help solve a larger problem. The challenge with these solutions is that they can be very complex to learn.
It is especially difficult to learn any single solution deeply enough to deliver a truly optimized implementation. So savvy organizations are asking the solution vendors themselves to deliver the goods. This makes a lot of sense, as no one knows the software better than the company that created it.
But is handing the configuration of the software over to the vendor for evaluation a foolproof approach? The answer is “no.” However, with the right preparation, a PoC is arguably the most reliable method of comparing options in an apples-to-apples manner.
It’s All in the Data
In the world of IDP, everything comes down to the precision of the data. Therefore, the sample sets you provide to each vendor for configuration, and the sets you use to verify performance, are critical. Generally, the more diverse a sample set in terms of both document type and document layout, the better it will be at revealing a system’s true abilities: the higher the variance, the more difficult it is for the system to reliably automate tasks such as identifying document types and locating data within each type.
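As an illustration, here is a minimal Python sketch of how a team might audit a sample set’s diversity before handing it to vendors. The doc_type and layout metadata keys are hypothetical stand-ins for whatever labels your corpus actually carries:

```python
from collections import Counter

# Hypothetical metadata describing each sample document; real corpora
# will carry different labels and far more entries.
samples = [
    {"doc_type": "invoice", "layout": "vendor_a"},
    {"doc_type": "invoice", "layout": "vendor_b"},
    {"doc_type": "purchase_order", "layout": "vendor_a"},
    # ... a real corpus would contain hundreds of documents
]

def diversity_report(samples):
    """Count documents per (type, layout) pair to expose coverage gaps."""
    counts = Counter((s["doc_type"], s["layout"]) for s in samples)
    for (doc_type, layout), n in sorted(counts.items()):
        print(f"{doc_type:>16} / {layout:<10} {n:3d} doc(s)")
    print(f"{len(counts)} type/layout combinations across {len(samples)} documents")

diversity_report(samples)
```

A report like this makes it obvious when a corpus is dominated by one document type or one layout, which is exactly the condition that lets a weak system look strong.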
If a system can be configured within a relatively short timeframe (think a few weeks) and deliver good performance on an unseen test set, then it is likely to produce a similar level of performance on real-world data. Simply put, a vendor that has to deal with a large amount of variance cannot easily “fake it” to produce good results.
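To make the verification step concrete, the sketch below scores extracted field values against ground truth on a held-out set. The extract function and the field names are hypothetical placeholders; real IDP systems expose their own APIs:

```python
def field_accuracy(extract, holdout):
    """Score extracted field values against ground truth on unseen documents."""
    correct = total = 0
    for document, truth in holdout:
        predicted = extract(document)
        for field, expected in truth.items():
            total += 1
            if predicted.get(field) == expected:
                correct += 1
    return correct / total if total else 0.0

# Example usage with stub data standing in for a real vendor system:
holdout = [("doc_0001.pdf", {"invoice_number": "INV-1001", "total": "842.50"})]
stub_extract = lambda doc: {"invoice_number": "INV-1001", "total": "840.00"}
print(f"Field-level accuracy: {field_accuracy(stub_extract, holdout):.0%}")
```

Scoring every field, not just every document, keeps one easy field from masking failures on the hard ones.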
A Cautionary Tale
While we certainly hope that you will never encounter a situation in which a selected vendor misrepresents the abilities of its solution, it does occur. In one case, an organization constructed a PoC that it believed would provide a solid evaluation of competing solutions. It provided a sample set for configuration and set aside another set used solely to verify performance; the vendors received only the configuration set.
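A split like the one that organization used can be made reproducible with a fixed random seed, as in this sketch. The file names and the 70/30 ratio are illustrative assumptions:

```python
import random

def split_corpus(documents, holdout_fraction=0.3, seed=42):
    """Shuffle reproducibly, then withhold a fraction for verification."""
    docs = list(documents)
    random.Random(seed).shuffle(docs)
    cut = int(len(docs) * (1 - holdout_fraction))
    return docs[:cut], docs[cut:]  # (configuration set, verification set)

config_set, verification_set = split_corpus([f"doc_{i:04d}.pdf" for i in range(100)])
print(len(config_set), "documents go to vendors;", len(verification_set), "are withheld")
```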
However, neither set included enough variance to truly stress-test the systems. Because of that lack of variance, vendors could employ tactics that produce good results on limited data sets but do not hold up in production, where real variance is encountered. After what was perceived to be careful vetting, a vendor was selected, and the company spent several months working with it on real production data only to find that the solution was not up to the challenge.
The upshot is that you can go through the motions of a supposedly solid PoC and still not get what you want. And waste months in the process. In today’s era of machine learning, solid and representative data has become even more critical, both to the vetting of potential solutions and to successful implementations. If you spend more time vetting features than carefully curating data, prepare for the worst.