Most organizations assume that document automation is built on optical character recognition (OCR). This is understandable: most documents are text-based, so the primary methods for automation tasks such as document classification and data entry rely on OCR (among other techniques).
But the view that document automation applies only to text-based information is narrow, and it excludes many opportunities to improve processes while lowering costs. Take, for instance, the process of auditing loan documentation. While most automation attention goes to origination, a number of processes occur during, just before, and just after a loan is funded and closed. Loan auditing involves many activities, not the least of which is simply verifying that all expected supporting documentation is present. And, as with document automation in origination, there is a need to look inside each document to verify the consistency of text-based data such as loan values, interest rates, and property addresses.
Yet there is still a type of information that does not lend itself to OCR: signatures, initials, and notary stamps. Staff performing audits must page through loan files and verify that signatures and initials for all borrowers, as well as completed notary stamps, are present where they are expected. Software that relies only on OCR cannot perform these tasks. Some solutions get creative by attempting to determine whether a part of a page contains something by checking for a certain density of pixels, but this approach produces so many errors that staff are forced to recheck the software’s results.
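To see why the pixel-density approach is fragile, here is a minimal sketch of the idea: binarize the page, count dark pixels inside a region of interest, and flag the region as "filled" when the density clears a threshold. The region coordinates and the 5% threshold below are illustrative assumptions, not any product's actual values.

```python
# A naive pixel-density check: does this page region contain "enough"
# dark pixels to count as a mark? The threshold is an assumption.

def ink_density(image, top, left, height, width):
    """Fraction of dark pixels (value 1) in a rectangular region of a
    binarized image, given as a list of 0/1 rows."""
    dark = sum(image[r][c]
               for r in range(top, top + height)
               for c in range(left, left + width))
    return dark / (height * width)

def region_has_mark(image, region, threshold=0.05):
    """Flag a region as 'filled' when dark-pixel density exceeds the
    threshold. Note the weakness: stray marks, scanner noise, and
    bleed-through all trip the same test, which is why staff end up
    rechecking the results."""
    return ink_density(image, *region) >= threshold

# A 4x8 toy page: a 'signature line' region with a few strokes in it.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(region_has_mark(page, (1, 0, 2, 8)))   # True
print(region_has_mark(page, (0, 0, 1, 8)))   # False
```

The check cannot tell a signature from a coffee stain or a fax artifact, which is exactly the error mode described above.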
But what if an organization, like a title insurance company, could receive a loan file, quickly and automatically list out all documents present, and then produce a report summarizing all key data required for each document? Even non-text data?
This is where computer vision meets deep learning neural networks to teach software to “see” like a human, except in a fraction of the time and with much greater precision. For instance, to detect signatures, software is trained to understand what a signature looks like compared with other types of data. Signatures are unlike ordinary handwriting in that they are almost hieroglyphic in nature: most of the time a person cannot read them, but they know what signatures look like. After training on hundreds of thousands of examples, software such as FormXtra.AI can reliably locate signatures anywhere in a document. So too for initials, and since initialing is closer to handprint and cursive, the software can even be trained to transcribe initials into letters and to distinguish between the initials of multiple co-signers.
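One common way such a detector is applied, sketched below, is a sliding window: a trained classifier scores each patch of the page, and high-scoring patches are reported as candidate signature locations. The classifier here is a stand-in stub (in practice, a deep network trained on the labeled examples described above); the patch size, stride, and 0.5 threshold are illustrative assumptions, not FormXtra.AI's actual parameters.

```python
# Sliding-window detection: score every patch of the page with a
# classifier and report the positions that clear a threshold.

def slide_windows(page_w, page_h, win, stride):
    """Yield the top-left corner of every window position on the page."""
    for y in range(0, page_h - win + 1, stride):
        for x in range(0, page_w - win + 1, stride):
            yield x, y

def detect_signatures(page_w, page_h, score_patch, win=64, stride=32,
                      threshold=0.5):
    """Return (x, y, score) for every window the classifier flags."""
    hits = []
    for x, y in slide_windows(page_w, page_h, win, stride):
        score = score_patch(x, y)
        if score >= threshold:
            hits.append((x, y, score))
    return hits

# Stand-in for a trained model: pretend it fires near (128, 64).
def fake_model(x, y):
    return 0.9 if (x, y) == (128, 64) else 0.1

print(detect_signatures(256, 128, fake_model))  # [(128, 64, 0.9)]
```

The point of the structure is that the detector makes no assumption about where on the page a signature appears, which is what lets it locate signatures "anywhere in a document."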
When it comes to notary stamps, we face an even more complex yet solvable problem: detecting the presence of a notary stamp anywhere on a page, in any orientation. Then there is the task of reading a handwritten date and noting the presence of the notary’s name and signature. While humans solve these tasks easily, systems can be confused by even slight variation. But with the right set of machine learning techniques, a high level of automation can be achieved.
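One classic way to cope with arbitrary orientation, sketched below, is to compare rotation-invariant features rather than raw pixels: a histogram of dark-pixel distances from the ink centroid is unchanged when a stamp is rotated or shifted. This toy illustrates the general idea only; it is not any vendor's actual algorithm, and the shapes and bin count are invented for the example.

```python
import math

def radial_profile(points, bins=8):
    """Histogram of point distances from the centroid, normalized so the
    profile is invariant to rotation, translation, and point count."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    rmax = max(dists) or 1.0
    hist = [0] * bins
    for d in dists:
        hist[min(int(d / rmax * bins), bins - 1)] += 1
    return [h / len(points) for h in hist]

def rotate(points, degrees):
    """Rotate a point set about the origin (simulates a tilted scan)."""
    a = math.radians(degrees)
    return [(x * math.cos(a) - y * math.sin(a),
             x * math.sin(a) + y * math.cos(a)) for x, y in points]

# A toy stamp: a ring of ink (the round border) plus a short spoke.
ring = [(10 * math.cos(t / 10), 10 * math.sin(t / 10)) for t in range(63)]
stamp = ring + [(r * math.cos(0.4), r * math.sin(0.4)) for r in range(1, 10)]

p_upright = radial_profile(stamp)
p_tilted = radial_profile(rotate(stamp, 37))
print(max(abs(a - b) for a, b in zip(p_upright, p_tilted)) < 0.05)  # True
```

Because the profile of an upright stamp and a tilted one match, a system built on features like these does not have to enumerate every possible angle.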
Moving to an even more advanced level, signatures within a document can be compared to one another to verify that the same signer was involved across all of the documentation.
Once combined with typical text-based data extraction, organizations involved in lending, from the originator to the servicer, can enjoy a high level of automation across a wide range of processes while raising the standard level of accuracy. More with less. And applied to all the information on a document.