Frequently Asked Questions About IDP
by Sujith Parakkunnath, on February 23, 2022 11:45:00 AM PST
Intelligent Document Processing (IDP) is a relatively new segment of technologies aimed at intelligent automation. As we talk to prospects and answer their questions about IDP, we find some recurring themes. In this post, we look at some of the most frequently asked questions about IDP and answer them from our perspective.
Q: Are IDP solutions API-based? Should they be API-based? What does it mean when an IDP solution is API-based or supports API output?
A: The way most IDP platforms work is that they receive a document as input, do their magic, and extract important business information from these documents. This information extracted at the end can be returned to the business systems in two broad ways - one for consumption by a human being and the second for consumption by another system.
For human consumption, IDP systems display this information in a user interface where the customer can verify the information and correct it if needed. The information can also be delivered to people as a file exchanged over email or through shared storage options like Google Drive, Dropbox, etc.
For consumption by other systems instead of human beings, this information is returned via an API, or Application Programming Interface. The API response can be readily consumed by other technologies like an ERP system or an RPA bot. So when you look at an IDP system, you should think about who is eventually going to consume this information. If that consumer is a system or a bot, you will need an API response.
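To make the idea of a system consuming an API response concrete, here is a minimal sketch in Python. The payload shape, the field names, and the per-field confidence score are all assumptions made for illustration; they do not describe any particular vendor's API. The sketch parses a JSON response and separates fields a downstream system could accept directly from fields that might be sent to a person for verification.

```python
import json

# Hypothetical response body. The structure, field names, and confidence
# scores are assumptions for illustration, not a real vendor format.
SAMPLE_RESPONSE = """
{
  "document_id": "doc-001",
  "fields": [
    {"name": "invoice_number", "value": "INV-1042", "confidence": 0.98},
    {"name": "total_amount", "value": "1250.00", "confidence": 0.91},
    {"name": "due_date", "value": "2022-03-15", "confidence": 0.62}
  ]
}
"""


def split_by_confidence(response_text, threshold=0.8):
    """Return (accepted, needs_review) dicts mapping field name to value."""
    payload = json.loads(response_text)
    accepted, needs_review = {}, {}
    for field in payload["fields"]:
        # Fields at or above the threshold go straight to the consuming system.
        target = accepted if field["confidence"] >= threshold else needs_review
        target[field["name"]] = field["value"]
    return accepted, needs_review


accepted, needs_review = split_by_confidence(SAMPLE_RESPONSE)
print("Send to ERP:", accepted)
print("Send to human review:", needs_review)
```

The threshold of 0.8 is arbitrary. In practice you would choose it based on how costly an uncorrected error is for your business.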
Q: Where are IDP solutions implemented? Cloud or On-Prem? If the vendor has the capabilities for both, how should I analyze what works best for my business?
A: Cloud and on-premise are the two main means of deploying IDP solutions. There is a third, lesser-known option: desktop-based IDP applications. These applications are quite lightweight, do not use extensive machine learning, and can be installed independently on the user’s machine. More mature IDP platforms that can process complex documents will need a few servers, either on-premise or in the cloud. There are two main criteria to help you choose the best solution for your business:
1. Data Privacy: Because of the sensitive nature of the data that a business handles, sometimes it is not an option to let that data leave the business’ control. In that case, businesses prefer an on-premise solution where they keep the data locally. The downside is that on-premise solutions are costlier, do not always get the latest algorithmic enhancements, and need technical staff to support them.
2. Cost: Since the cloud-hosted option is shared, it costs less. You also save costs by not investing in your own infrastructure or the technical support team. The downside is that your data will leave your system and be processed in the cloud.
Q: Does an IDP solution use OCR? Do they use proprietary OCR or do they have OCR Partners? As the OCR determines the quality of extraction, how do I get insights as a customer to the OCR engine capabilities of IDP vendors?
A: This is a great question. IDP solutions use roughly five technologies: Computer Vision, Predictive Analytics, Natural Language Processing, Machine Learning, and OCR. Most IDP solutions use multiple OCR engines. Some have their own OCR engine, and some use third-party OCR engines. Most IDP platform vendors will tell you what they use. But as a customer, you should focus on OCR performance rather than worrying about which OCR engine is being used. The reason is that even if an IDP platform uses a third-party OCR engine, it might do significant pre- and post-processing on the OCR input and output. So, you will get a more accurate extraction than you would by using that OCR engine directly.
Try your complex, low-quality documents on different IDP platforms and see which one handles your documents better. You should go with a better solution rather than a specific OCR engine.
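One simple way to run such a comparison is to key in the correct values for a handful of your own documents and score each platform's output against them. The sketch below assumes you have already collected each platform's extracted fields as plain dictionaries; the platform names, field names, and values are made up for illustration, and exact string matching is only one of several reasonable scoring rules.

```python
def field_accuracy(expected, extracted):
    """Fraction of expected fields whose extracted value matches exactly."""
    if not expected:
        return 0.0
    correct = sum(
        1
        for name, value in expected.items()
        if extracted.get(name, "").strip() == value.strip()
    )
    return correct / len(expected)


# Manually verified values for one test document (illustrative data).
ground_truth = {
    "patient_name": "Jane Doe",
    "policy_number": "P-7781",
    "date_of_birth": "1980-04-02",
}

# Output collected from two hypothetical platforms for the same document.
platform_outputs = {
    "platform_a": {
        "patient_name": "Jane Doe",
        "policy_number": "P-7781",
        "date_of_birth": "1980-04-02",
    },
    "platform_b": {
        "patient_name": "Jane Doe",
        "policy_number": "P-7181",  # misread digit; date_of_birth is missing
    },
}

scores = {
    name: field_accuracy(ground_truth, output)
    for name, output in platform_outputs.items()
}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

Averaging this score over a representative set of your hardest documents gives a fairer picture than any single file.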
Q: I did my research and did not find a perfect solution that meets my needs. Why is there not a perfect AI solution for IDP extraction?
A: There are two sides to this coin. On the one hand, most IDP solutions excel at a particular type of document: structured, semi-structured, or unstructured. On the other hand, businesses do not have just one type of document. Every business generally has a combination of all three types. That is why many businesses pick a really good solution for forms, but then realize that it cannot handle semi-structured documents equally well. It needs customization to handle these documents.
As we explained in our recent blog post regarding document types, the broader systems are easier to configure. If you find a system that can handle unstructured documents, it can usually be configured to handle structured documents like forms. But the reverse is not true. So, you should figure out the broadest category of documents that your business needs to process. If it is semi-structured documents, then pick an IDP platform that excels at semi-structured documents and configure it for other documents.
Q: I have a document, say an insurance document, from which I want to extract printed demographic information and some key-value pairs from handwritten text (not just detection but free-form writing, such as a doctor’s prescription). I have another type of document with clinical information, where a single document may contain information about multiple patients. Would IDP be a good choice for me?
A: Yes. In fact, true IDP solutions are full platforms rather than just products. Their strength is not in solving one problem really well but in handling all aspects of document data extraction. You can design a specific solution and workflow based on this platform that best suits your business. A comprehensive IDP solution needs some level of configuration along with out-of-the-box capabilities.
Q: What is the process for retraining? Are corrections captured and added to the ML learning cycle?
A: Machine learning systems are in some ways like Chinese bamboo trees. Just as with these trees, you need to invest time upfront, training the system on your specific documents. During that time, it learns variations, understands layouts, and builds a foundational capability to support the prediction wave that is about to come. But once it takes off, it gives you amazing accuracy.
Here is reference data from an actual live implementation of Infrrd’s IDP platform for a document that had thousands of variations. It was nearly impossible to create and manage templates for this type of document because it did not follow a fixed format, and data was never in a fixed position on the document.
This is how the accuracy improved with each iteration of training:
By the time the customer started using the final version of the model, it had a very high accuracy rate.
Traditional systems had provided the customer with a 65% accuracy rate, while this model started at an initial accuracy of 63%. Within a few weeks, however, it outperformed every other accuracy benchmark that the customer had experienced. It feeds off the corrections made by the data processing teams as well as the new document layouts and variations that it has not processed before. When the system processes a document and the results are not corrected, it treats that as confirmation that it performed the right extraction. This increase in accuracy is made possible by continuous relearning using our feedback loop.
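The idea of learning from both corrections and uncorrected results can be sketched in a few lines of Python. This is not a description of Infrrd's actual implementation; the function name, record layout, and sample values are assumptions chosen only to show how a reviewed document could be turned into feedback for retraining.

```python
def build_feedback(extracted, reviewed):
    """Turn one reviewed document into labelled feedback records.

    A field the reviewer left unchanged counts as a confirmation;
    a field the reviewer changed counts as a correction.
    """
    records = []
    for name, predicted in extracted.items():
        # If the reviewer did not touch a field, the prediction stands.
        final = reviewed.get(name, predicted)
        records.append(
            {
                "field": name,
                "predicted": predicted,
                "label": final,
                "kind": "confirmation" if final == predicted else "correction",
            }
        )
    return records


# Illustrative data: the model misread a zero as the letter O in the total.
extracted = {"vendor": "Acme Corp", "total": "1O0.00"}
reviewed = {"total": "100.00"}

feedback = build_feedback(extracted, reviewed)
for record in feedback:
    print(record)
```

Both kinds of record are useful for retraining: corrections show the model what it got wrong, and confirmations reinforce what it got right.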
Q: What is the average number of documents/training time/training set data required for a model to be effective? What is Infrrd’s process?
A: The answer to this question largely depends on the type of documents. For some documents, one month of data is enough to learn to get things started. For more complex documents, it may take 3 months of data. Infrrd’s approach is to get the customer started with what they have. How many documents you have isn’t a critical issue. The important thing is that every day or every week as you process documents, you see your accuracy numbers climbing.
You can start with zero documents on day one if you want, though this means you will also start with 0% accuracy. This is not such a bad thing for customers who already process all of their data manually. They will see the manual effort going down and the accuracy climbing every week.
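If you want to watch that climb yourself, one rough measure is the share of extracted fields that needed no manual correction in a given week. The sketch below uses made-up counts purely for illustration; the function name and the log format are assumptions, not part of any product.

```python
def weekly_accuracy(weeks):
    """Map each week label to the share of fields that needed no correction."""
    result = {}
    for label, total_fields, corrected_fields in weeks:
        if total_fields == 0:
            result[label] = 0.0
        else:
            result[label] = (total_fields - corrected_fields) / total_fields
    return result


# (week, fields processed, fields corrected by hand); illustrative numbers.
log = [
    ("week 1", 200, 200),  # starting from zero: every field keyed manually
    ("week 2", 200, 120),
    ("week 3", 200, 50),
]

trend = weekly_accuracy(log)
values = list(trend.values())
# True when no week is less accurate than the one before it.
is_climbing = all(earlier <= later for earlier, later in zip(values, values[1:]))
print(trend, "climbing:", is_climbing)
```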
Q: What can IDP models process out of the box and what configuration is possible? Can IDP models only process PDF documents? Can these models process data in an email message combined with information in a document? What about things like logos, symbols, and signatures?
A: Different types of IDP solutions can process different kinds of data. Infrrd’s IDP platform excels at processing semi-structured and unstructured data. We handle most semi-structured data out of the box. Because of the broad nature of unstructured data, your specific implementation may need some customization on top of our platform.
Capabilities to handle visual data like tables, logos, signatures, and symbols are built into our platform. The unique value proposition of our platform is its ability to learn from each and every customer’s data. Our visual recognition engine can predict logos for one customer and predict shipping handling symbols for another.