IDP with Intelligent Table Extraction

By
Anusha Venkatesh
IDP Evangelist
January 27, 2022

If you are looking for an intelligent document processing (IDP) solution, you are looking to extract meaningful information from a large volume of documents. Most of these documents contain tables, which are structured presentations of information in rows and columns with underlying relationships among their elements and attributes.

Overview

Tables contain valuable information because tabular data is presented in a clean, structured, and concise format that allows for quick analysis. Companies dealing with a large volume of documents may have to extract information from a large number of tables, perhaps millions, in structured or unstructured formats with varying structural semantics. This makes it more important than ever to choose an IDP solution that has a table extraction feature. However, you may need a solution that is not only effective but also does more than extract information.

Challenges

Extracting information from tables has a unique set of challenges in IDP because of the heterogeneous nature of tables. Some of the key challenges in extracting tabular information are as follows:

  • Detecting the table region accurately
  • Detecting tables across many different structures, layouts, and variations
  • Detecting the exact boundaries
  • Detecting the rows and columns and extracting information from them
  • Segmentation based on semantics
  • Identifying merged cells
  • Denoising blank cells and irrelevant content
  • Decoding the structural relationship of the table data

Infrrd’s Solution

Infrrd’s IDP solution includes a state-of-the-art, data-driven table extraction feature that uses AI-based technologies, such as deep learning, neural networks, and computer vision, to build machine learning algorithms for table detection and extraction. The process is accurate and effective, and it is also cognitive in nature: an end-to-end trained model handles multiple, diverse table variations to address real-world scenarios. Moreover, each extraction adds to the trained data set, which improves accuracy and quality for future extractions.

A key benefit of Infrrd’s table extraction feature is that you can extract similar columns from multiple document types or variations. For example, assume you have received different documents from different vendors. You want to retrieve the item name, description, and total amount from these documents, whether it’s a utility bill, an invoice document, or a McDonald’s receipt. The trained Infrrd model can retrieve the same columns from all these documents using the table extraction feature. The advantage here is that our table extraction feature is applied globally to all document types. If the document you upload has a table, then it will be detected whether it’s an invoice, a receipt, an insurance form, or a loan document.
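To make the idea of retrieving the same columns from different vendors concrete, here is a minimal sketch. It is not Infrrd's method: a trained model learns these mappings, whereas this example swaps in a plain lookup table. The names `HEADER_SYNONYMS`, `normalize_header`, and `project_rows`, and the header variants themselves, are hypothetical.

```python
# Hypothetical synonym map: canonical field -> header variants seen across vendors.
# A trained model would learn these; a lookup table is used here only to illustrate.
HEADER_SYNONYMS = {
    "item_name": {"item", "item name", "product", "service"},
    "description": {"description", "details", "desc"},
    "total_amount": {"total", "total amount", "amount", "amount due"},
}


def normalize_header(header):
    """Return the canonical field for a raw header, or None if it is unknown."""
    # Lowercase, drop simple punctuation, and collapse whitespace.
    cleaned = " ".join(header.lower().replace(".", " ").replace(":", " ").split())
    for canonical, variants in HEADER_SYNONYMS.items():
        if cleaned in variants:
            return canonical
    return None


def project_rows(headers, rows):
    """Keep only the cells whose column maps to a canonical field."""
    mapping = {i: normalize_header(h) for i, h in enumerate(headers)}
    return [
        {mapping[i]: cell for i, cell in enumerate(row) if mapping.get(i)}
        for row in rows
    ]


# Two differently labeled documents yield the same three fields.
invoice = project_rows(
    ["Item Name", "Desc.", "Qty", "Total Amount"],
    [["Paper", "A4 ream", "2", "9.00"]],
)
receipt = project_rows(
    ["PRODUCT", "Details", "Amount:"],
    [["Burger", "Combo meal", "6.50"]],
)
```

Both `invoice` and `receipt` end up with the keys `item_name`, `description`, and `total_amount`, even though the source headers differ and the invoice carries an extra quantity column.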

When you upload a document to Infrrd’s IDP platform, the system first checks whether the document contains tables. It then identifies the boundaries of each table and also the location and coordinates of each row and column. Next, it identifies and maps the headers for relevant columns. Finally, the tables are automatically extracted and relevant fields are mapped.
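The steps above (locate rows and columns, map headers, extract fields) can be illustrated with a deliberately simple sketch. It assumes OCR has already produced words with page coordinates, and it uses a coordinate heuristic in place of the deep learning models the platform relies on. The `Word` class, `group_rows`, `extract_table`, and the tolerance value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the word on the page
    y: float  # vertical position of the word on the page


def group_rows(words, y_tol=5.0):
    """Cluster words into rows: words within y_tol of a row's first word join it."""
    rows = []
    for w in sorted(words, key=lambda w: (w.y, w.x)):
        if rows and abs(rows[-1][0].y - w.y) <= y_tol:
            rows[-1].append(w)
        else:
            rows.append([w])
    # Order the words of each row from left to right.
    return [sorted(r, key=lambda w: w.x) for r in rows]


def extract_table(words, y_tol=5.0):
    """Treat the first row as headers and assign each cell to the nearest header by x."""
    rows = group_rows(words, y_tol)
    if not rows:
        return [], []
    header_row, body = rows[0], rows[1:]
    headers = [w.text for w in header_row]
    records = []
    for row in body:
        record = {h: "" for h in headers}
        for w in row:
            nearest = min(header_row, key=lambda h: abs(h.x - w.x))
            record[nearest.text] = (record[nearest.text] + " " + w.text).strip()
        records.append(record)
    return headers, records


# Slightly misaligned coordinates, as OCR output often is.
words = [
    Word("Item", 10, 100), Word("Qty", 120, 101), Word("Total", 200, 100),
    Word("Paper", 10, 120), Word("2", 122, 121), Word("9.00", 198, 120),
    Word("Pens", 11, 140), Word("10", 121, 140), Word("4.50", 199, 141),
]
headers, records = extract_table(words)
```

A heuristic like this breaks down on merged cells, multi-line cells, and skewed scans, which is where the challenges listed earlier come in and why a trained model is used instead.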

Fields are automatically recognized by the machine learning algorithms based on your requirements. You can then edit the extracted table to correct any deviations from those requirements. Some of the powerful table-extraction capabilities of our solution are:

  • Insert: Rows or columns can be added to a table. For example, if the IDP system automatically extracted the item name and VAT rate but you also need the quantity, you have an option to select and tag this column to the table.
  • Redesign: The table border can be recaptured to include only the fields you need.
  • Reorder: The position of the columns can be reordered by just selecting the table and dragging the respective column.
  • Delete row/column: Specific columns and rows can be deleted if they don’t have data or you don’t need them.
  • Delete table: If the table detection was incorrect or you want another table, you can delete the current table and redraw the boundaries.
  • Borderless: Our algorithms will detect tables whether they have borders or not.
  • Combine: If a table spans multiple pages of the document, it is combined and extracted as a single table.
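In the product these corrections are made through the UI. As a rough illustration of what the Insert, Reorder, Delete, and Combine operations mean for the underlying data, here is a minimal sketch that models a table as a dictionary of headers and rows. The data model and all function names are assumptions made for this example, not Infrrd's API.

```python
def insert_column(table, name, values, position=None):
    """Insert: add a column, for example a quantity column that was not tagged."""
    if len(values) != len(table["rows"]):
        raise ValueError("need exactly one value per row")
    pos = len(table["headers"]) if position is None else position
    headers = table["headers"][:pos] + [name] + table["headers"][pos:]
    rows = [r[:pos] + [v] + r[pos:] for r, v in zip(table["rows"], values)]
    return {"headers": headers, "rows": rows}


def reorder_columns(table, new_order):
    """Reorder: rearrange columns to match the given header order."""
    idx = [table["headers"].index(h) for h in new_order]
    rows = [[r[i] for i in idx] for r in table["rows"]]
    return {"headers": list(new_order), "rows": rows}


def delete_column(table, name):
    """Delete column: drop a column that is empty or not needed."""
    i = table["headers"].index(name)
    headers = table["headers"][:i] + table["headers"][i + 1:]
    rows = [r[:i] + r[i + 1:] for r in table["rows"]]
    return {"headers": headers, "rows": rows}


def combine_pages(fragments):
    """Combine: merge page fragments of one table, dropping repeated header rows."""
    headers = fragments[0]["headers"]
    rows = []
    for frag in fragments:
        if frag["headers"] != headers:
            raise ValueError("page fragments must share the same headers")
        rows.extend(r for r in frag["rows"] if r != headers)
    return {"headers": headers, "rows": rows}


# A table that continues on a second page, where the header row is repeated.
page1 = {"headers": ["Item", "VAT rate"], "rows": [["Paper", "20%"]]}
page2 = {"headers": ["Item", "VAT rate"],
         "rows": [["Item", "VAT rate"], ["Pens", "20%"]]}

table = combine_pages([page1, page2])
table = insert_column(table, "Quantity", ["2", "10"])
table = reorder_columns(table, ["Item", "Quantity", "VAT rate"])
table = delete_column(table, "VAT rate")
```

Each function returns a new table instead of modifying its input, so any single correction can be undone by keeping the previous value.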

Infrrd’s table extraction feature enhances the capabilities and user experience of your IDP solution.

Schedule a table extraction demo

Frequently asked questions

What does your pricing model look like?

We price based on the annual volume of pages and the complexity of the document type. We can get you preliminary pricing once we have outlined a solution. Let's do this.

To know more, book a 15-min session with an IDP expert

How can I try Infrrd before I commit to a full deployment?

Sure.  The first step is to schedule a guided demo where you get to jump into the thick of it.  After you explore our solution you can try a proof of concept. When you're ready, you can deploy the system to one use case.  Then more use cases.  Then across your enterprise.


How does your system integrate with others in my enterprise?

We play nice. Our solutions are API-based. Your documents are fed into the solution using APIs, and extracted data is sent out through APIs. We use REST APIs.


Does your solution run in the cloud or on premises?

Our solution is cloud-native but is also designed for on-premises deployments. It's your choice how you want to deploy it.


Does Infrrd run on mobile or desktop devices?

Glad you asked. Our data extraction process runs on servers. We have found that performance and accuracy decline when running on a desktop or mobile device. (Remember, Infrrd is running a powerful AI stack.)


Does your system work out of the box or does it require training?

Common documents and use cases work out of the box.  The cool thing is your solution will improve as the system learns from your documents upfront and over time.


How does your solution handle corrections?

Did you know no system is 100% accurate all the time? When extraction errors occur, you want to correct them. We provide a simple UI that your business analysts can use to make corrections.


Does your solution work with handwriting?

Our solution excels at data extraction from handwriting.  We've got proprietary methods and techniques that do the trick.  It's pretty cool.  See for yourself.
