Much of the attention on Intelligent Document Processing (IDP) focuses on producing structured data from unstructured, document-based information in support of business processes. For example, in accounts payable scenarios, IDP can automate the entry of invoice, purchase order, and remittance data into accounting systems of record, sparing higher-paid accounting staff much of the drudgery and time of manual data entry.
Intelligent Document Processing powered by artificial intelligence also holds great promise for data accessibility in healthcare. IDP capabilities can also serve other tasks, including predictive and risk analysis. According to Chilmark in its report, The Promise of AI & ML in Healthcare, “substantial work in curating, labeling, and cleaning data is required to make datasets market-ready for healthcare applications. Nearly 80% of the work to develop and test an AI/ML algorithm is preparing health data for use in training algorithms.” This is one area where IDP can provide substantial benefits.
Predictive and Risk Analysis for Healthcare
For instance, Parascript participated in a project orchestrated by the Heritage Provider Network to analyze claims-related data in order to predict the need for re-hospitalization. The focus was to identify the patients most at risk of re-hospitalization and proactively provide them with more care, resulting in improved overall health and reduced costs. While the analyzed data was in the form of a structured database, the core information is typically compiled manually from tens of thousands of claims cases, many of which are stored as claims documents. IDP software provides a cost-effective way to access this claims data, supporting efforts across the US to raise the standard of care and improve outcomes while reducing costs.
Hidden Patterns
Another case where IDP supports low-cost access to complex, document-based information for use in risk analysis is patient medical charts. Medical charts represent a treasure trove of unstructured data that can be parsed and analyzed using Natural Language Processing (NLP) to identify hidden patterns. While EHR/EMR-based data is available, a lot of useful qualitative data is stored within the complex structure of medical charts. Each chart is made up of different medical records, and the needed data is rarely located in the same position from one record to the next. Additionally, analysis of data in medical charts requires treating each medical record as a separate entity, yet many charts exist as single monolithic PDF files containing many discrete records. So there is not only the challenge of accessing data from each record, but also the challenge of separating each chart file into individual records.
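The separation problem can be pictured as a simple data transformation. The following is a minimal sketch in Python, assuming the chart has already been converted to per-page text and that some upstream step has flagged which pages begin a new record. The Page type, the starts_record flag, and the split_chart function are hypothetical names used for illustration only; they do not describe any particular IDP product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    number: int          # 1-based page number within the chart file
    text: str            # text recovered from the page (e.g. by OCR)
    starts_record: bool  # hypothetical flag: True if a new record begins here


def split_chart(pages: List[Page]) -> List[List[Page]]:
    """Group consecutive pages into records, opening a new group at each flagged page."""
    records: List[List[Page]] = []
    for page in pages:
        # The first page always opens a record, even if it was not flagged.
        if page.starts_record or not records:
            records.append([page])
        else:
            records[-1].append(page)
    return records


# Made-up five-page chart containing three records.
chart = [
    Page(1, "Discharge summary ...", True),
    Page(2, "... continued", False),
    Page(3, "Lab report ...", True),
    Page(4, "Radiology report ...", True),
    Page(5, "... continued", False),
]
records = split_chart(chart)
page_groups = [[p.number for p in record] for record in records]
print(page_groups)  # [[1, 2], [3], [4, 5]]
```

Once the chart is grouped this way, each record can be handed to data extraction and NLP on its own. The hard part in practice is producing a reliable starts_record flag, which this sketch simply takes as given.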
Using Machine Learning Algorithms
Again, this is where IDP can help. Using machine learning algorithms, IDP software can be trained on medical charts to identify key characteristics of each individual medical record. Different algorithms are often employed to evaluate various attributes, such as the presence of graphical information (e.g., logos), textual data (e.g., facility names and addresses), and even spatial information such as the distance between different dates on a page and the use of specific language related to those dates. All of these attributes are then analyzed to find the most reliable way to identify and separate one record from another. Once the records are separated, data extraction can be applied, with the final step being to employ NLP to identify specific patterns in the text that can reveal problems and conditions not only for a single patient, but across a patient population.
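To make the idea of combining several page attributes more concrete, here is a minimal sketch in Python. It assumes three hand-picked signals (a logo flag supplied by some upstream image step, a facility-name match, and the number of date-like strings on the page) combined with made-up point values. A trained IDP system would learn such signals and their weights from labeled charts rather than hard-code them. Every name, weight, threshold, and the facility "Riverside Clinic" below is hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches dates such as 03/14/2021 or 3-4-21 (illustrative pattern only).
DATE_RE = re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b")


@dataclass
class PageFeatures:
    has_logo: bool      # assumed to come from an upstream image-analysis step
    has_facility: bool  # page text mentions a known facility name
    date_count: int     # number of date-like strings found on the page


def extract_features(text: str, has_logo: bool, facilities: List[str]) -> PageFeatures:
    """Derive simple graphical and textual signals for one page."""
    lowered = text.lower()
    return PageFeatures(
        has_logo=has_logo,
        has_facility=any(name.lower() in lowered for name in facilities),
        date_count=len(DATE_RE.findall(text)),
    )


def boundary_score(f: PageFeatures) -> int:
    """Sum of points per signal; the point values are invented for illustration."""
    score = 0
    if f.has_logo:
        score += 2
    if f.has_facility:
        score += 2
    if f.date_count >= 2:  # e.g. a service date plus a report date in a header
        score += 1
    return score


def is_record_start(f: PageFeatures, threshold: int = 3) -> bool:
    """Treat the page as the first page of a record when enough signals agree."""
    return boundary_score(f) >= threshold


facilities = ["Riverside Clinic"]  # made-up facility name
first = extract_features(
    "Riverside Clinic\nDate of service: 03/14/2021  Report date: 03/15/2021",
    has_logo=True,
    facilities=facilities,
)
continuation = extract_features(
    "Findings continued. No acute abnormality.",
    has_logo=False,
    facilities=facilities,
)
print(is_record_start(first), is_record_start(continuation))  # True False
```

A fixed point scheme like this is only a stand-in for the learned models the text describes, but it shows why combining graphical, textual, and date-related signals can be more reliable than relying on any single one.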
So while much of the attention on IDP goes to data extraction for business processes, there is a great deal of value in using IDP for key data preparation activities that feed downstream ML-powered analysis. The result is quicker, more efficient access to the data used to train and apply machine learning for improving processes, whether those processes exist in healthcare, insurance, or finance.