Understanding IDP: Data Integration

By
Anusha Venkatesh
IDP Evangelist
January 5, 2022

According to Gartner, "The market for document capture, extraction, and processing is highly fragmented. Data and analytics leaders should use this research to understand the process flow and differentiated capabilities offered by intelligent document processing solutions". Gartner's recently released "Infographic: Understand Intelligent Document Processing" covers these 6 critical flows in IDP.

1. Capture or Ingestion

2. Document Preprocessing

3. Document Classification

4. Data Extraction

5. Data Validation and Feedback Loop

6. Integration

Source: Gartner, Infographic: Understand Intelligent Document Processing, Shubhangi Vashisth et al., 22 September 2021

This is the fifth and final post in the series where we explore Integration. Check out our earlier posts in this series, Capture and Preprocessing, Document Classification, Data Extraction, and Validation and Feedback Loop.

Meaningful data delivers the most value when it is integrated with your business or enterprise systems, whether those are on-premises systems, cloud systems, or complex systems such as an ERP. Today, businesses are focused on building comprehensive solutions for constantly evolving customer problems and needs, and an integrated system is important for greater efficiency and business effectiveness.

Why Integration?

When it comes to Business Intelligence (BI) and Analytics, unstructured data has long been kept outside of data mining. Suppose you run a retail clothing store. When you sell a dress, you record the sale and capture details like selling price, payment method, discount, and tax, but you do not record how the dress looked: whether it had half sleeves or full sleeves, or what kind of neck design it had. All of this information is potentially in the photo of the dress. Leaving it out limits your understanding of customer behavior. You cannot answer questions like: what percentage of people who buy faded blue jeans pair them with belts featuring oversized buckles?

In the absence of a system that could make sense of it, unstructured data was always kept outside the realm of BI and Analytics. Structured data, like your sales record, also happens to be a small fraction of the overall data that you have access to. The majority of data that any organization deals with is unstructured data such as emails, documents, receipts, and photos. Now that IDP platforms can convert this unstructured data into structured data, exciting new avenues open up for understanding your customers and their behavior better through data mining.

Here are a few examples:
  • From receipts issued by stores that you do not own, you can now figure out whether people who buy beer also buy wine. If you find they do, you could run a promotion selling the two together.
  • From payslips in mortgage application documents, you can figure out that most people who work in sales in the manufacturing industry usually get only X% of their sales commissions.
  • From supporting insurance claim documents, you can automatically figure out what percentage of a car repair cost comes from body shop work versus replacement parts for a Toyota Prius serviced in Chicago.
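
To make the first example concrete, here is a minimal sketch of a co-purchase check over receipt data that has already been extracted into structured form. The record layout, the sample values, and the function name `co_purchase_rate` are assumptions made for illustration, not the output format of any particular IDP platform.

```python
# Hypothetical structured records, as they might look after extraction
# from receipt images. The field names are illustrative assumptions.
receipts = [
    {"receipt_id": "r1", "items": ["beer", "wine", "chips"]},
    {"receipt_id": "r2", "items": ["beer", "bread"]},
    {"receipt_id": "r3", "items": ["beer", "wine"]},
    {"receipt_id": "r4", "items": ["wine", "cheese"]},
]


def co_purchase_rate(records, item_a, item_b):
    """Share of receipts containing item_a that also contain item_b."""
    with_a = [r for r in records if item_a in r["items"]]
    if not with_a:
        return 0.0  # no receipts with item_a, so nothing to measure
    with_both = [r for r in with_a if item_b in r["items"]]
    return len(with_both) / len(with_a)


beer_then_wine = co_purchase_rate(receipts, "beer", "wine")
print(f"Beer buyers who also bought wine: {beer_then_wine:.0%}")
```

Once the data is structured, the question becomes an ordinary filter-and-count, which is the kind of query BI tools already handle well.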

You can take this analysis one step further by opening up your extracted data to search using Natural Language Query (NLQ) technologies. Instead of setting up reports in advance, you can fire off a query in natural language. If you had an automated assistant, you could ask, “How many mortgage applications did we receive for homes in the Bay Area yesterday?” and get the right answer.

Typical Architecture

A typical IDP integration architecture is as follows:

Integration Features

Some of the common features to look for when evaluating an IDP platform's integration capabilities are as follows:

  • No code platform
    Plug and play or drag and drop options to connect upstream and downstream applications.
  • Question platform
    Option for sales and marketing teams to ask ad hoc questions and get answers on the fly.
  • Multi-platform Integrations
    Support for raising queries from multiple platforms.
  • Data Synchronization
    Option to automatically synchronize the latest changes from third-party platforms.
  • UI configurations
    Options for users to configure integrations or data sources from the user interface.
  • Robotic Assistants
    Routine functions are handled by robotic assistants (bots). Sometimes, bots even make decisions, helping to increase accuracy through straight-through processing (STP).
  • Analytics
    Integration gives you the opportunity to build a holistic analytics dashboard for evaluating performance.
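
As a rough illustration of the kind of decision a bot might make, the sketch below routes a document either straight through or to manual review based on per-field confidence scores. The `ExtractedField` type, the 0.9 threshold, and the routing labels are assumptions chosen for this example. Real platforms may use different signals and rules.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed to be reported by the extraction step, 0.0 to 1.0


def route_document(fields, threshold=0.9):
    """Return 'straight_through' only if every field meets the threshold."""
    if fields and all(f.confidence >= threshold for f in fields):
        return "straight_through"
    # Empty or low-confidence extractions go to a person for review.
    return "manual_review"


invoice = [
    ExtractedField("invoice_number", "INV-1001", 0.98),
    ExtractedField("total", "1250.00", 0.72),
]
print(route_document(invoice))
```

Here the low-confidence total sends the invoice to manual review. If every field were at or above the threshold, the document would be processed without human involvement.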

Integration methods

Some of the common methods used for IDP integration with third-party solutions are as follows:

  • API
    This is one of the most common code-based methods, in which multiple systems are connected through Application Programming Interfaces (APIs).
  • Webhooks
    Webhooks can be considered lightweight APIs for sharing real-time information among applications. One application sends an HTTP callback to another when an event occurs.
  • Orchestration
    This method is effective where there are ambiguities or variations, such as semi-structured or unstructured data. It primarily focuses on automating a series of tasks to ensure seamless integration.
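
For the webhook method, a receiving application usually needs to confirm that a notification really came from the sender before acting on it. The sketch below shows one common approach, an HMAC-SHA256 signature over the request body. The event name, payload fields, and shared secret are assumptions for illustration. This is not a description of how any specific IDP product signs its webhooks.

```python
import hashlib
import hmac
import json


def verify_and_parse(body: bytes, signature_hex: str, secret: bytes) -> dict:
    """Check the HMAC-SHA256 signature of a webhook body, then parse the JSON."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, signature_hex):
        raise ValueError("signature mismatch")
    return json.loads(body)


# Simulate the sender: a hypothetical "document extracted" notification.
secret = b"demo-secret"
body = json.dumps({"event": "document.extracted", "document_id": "doc-123"}).encode()
signature = hmac.new(secret, body, hashlib.sha256).hexdigest()

event = verify_and_parse(body, signature, secret)
print(event["event"], event["document_id"])
```

In a real deployment, the body and signature would arrive in an HTTP request handled by your web framework, and the secret would come from configuration rather than the source code.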

Here is a table that depicts the industry-relevant integration features and Infrrd’s capabilities:

Frequently asked questions

What does your pricing model look like?

We price based on the annual volume of pages and the complexity of the document type. We can get you preliminary pricing once we have outlined a solution. Let's do this.

To know more, book a 15-min session with an IDP expert

How can I try Infrrd before I commit to a full deployment?

Sure. The first step is to schedule a guided demo where you get to jump into the thick of it. After you explore our solution, you can try a proof of concept. When you're ready, you can deploy the system to one use case. Then more use cases. Then across your enterprise.

How does your system integrate with others in my enterprise?

We play nice. Our solutions are API-based. Your documents are fed into the solution using APIs, and extracted data is sent out through APIs. We use REST APIs.
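
To give a feel for what an API-based flow of this shape can look like, here is a minimal sketch that builds a document upload request and parses an extraction response. Everything specific in it is a placeholder assumption: the base URL `https://idp.example.com/v1`, the `/documents` path, the header names, and the response layout are invented for illustration and are not Infrrd's actual API.

```python
import json
import urllib.request

BASE_URL = "https://idp.example.com/v1"  # hypothetical endpoint, not a real service


def build_upload_request(file_name: str, content: bytes, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits a document."""
    return urllib.request.Request(
        url=f"{BASE_URL}/documents",
        data=content,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/octet-stream",
            "X-File-Name": file_name,  # assumed custom header for the file name
        },
    )


def parse_extraction_response(raw: bytes) -> dict:
    """Flatten an assumed response layout into a name-to-value mapping."""
    payload = json.loads(raw)
    return {field["name"]: field["value"] for field in payload["fields"]}


request = build_upload_request("invoice.pdf", b"%PDF-1.7 ...", "demo-key")
sample_response = b'{"fields": [{"name": "total", "value": "1250.00"}]}'
print(request.get_method(), request.full_url)
print(parse_extraction_response(sample_response))
```

The request is only constructed here, not sent. For the real endpoints, authentication scheme, and payload formats, refer to the vendor's API documentation.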

Does your solution run in the cloud or on premises?

Our solution is cloud-native but is also designed for on-premises deployments. It is your choice how you want to deploy it.

Does Infrrd run on mobile or desktop devices?

Glad you asked. Our data extraction process runs on servers. We have found that performance and accuracy decline when running on a desktop or mobile device. (Remember, Infrrd is running a powerful AI stack.)

Does your system work out of the box or does it require training?

Common documents and use cases work out of the box. The cool thing is that your solution will improve as the system learns from your documents, both upfront and over time.

How does your solution handle corrections?

Did you know no system is 100% accurate all the time? When extraction errors occur, you want to correct them. We provide a simple UI that your business analysts can use to make corrections.

Does your solution work with handwriting?

Our solution excels at data extraction from handwriting.  We've got proprietary methods and techniques that do the trick.  It's pretty cool.  See for yourself.
