How to Use Repeatability to Gain 120% Data Processing Efficiency
by Sujith Parakkunnath, on August 30, 2021 7:40:52 AM PDT
This past month, the Infrrd team collaborated with a client, a large financial institution with a team of about 100 people extracting data from documents manually. This customer processes a huge variety of documents from numerous sources, none of which follow a fixed format. Every document from every provider comes with its own unique layout and vocabulary of information. These documents arrive in the firm's mail room, where they are combined into packets of hundreds of documents and then handed off to the data processing team. In all, the customer has millions of layout combinations from which the team extracts data.
As with all our customers, we were tasked with making this team faster and more efficient. Before turning to automated document processing, we spent some time analyzing how the team worked. Each member went through the documents one by one. Every document looked completely different from the last, and it took a few seconds to orient to each new layout.
To quantify our understanding a little more, we asked three of these users to start processing documents within our system. This is how their processing data looked at the end of four hours:
The users' efficiency was quite low to begin with, but one interesting pattern emerged: every user started slowly and grew more efficient with each passing hour. At the end of the day, our machine learning algorithms kicked in and learned from the corrections the users had made during the day. The next day, we repeated the process, and this is how the numbers looked:
With the improved accuracy, the users processed 10% more documents in the same time frame, a small but welcome incremental improvement. The next day, the machine learning algorithms learned from the corrections again. But this time, we also changed how documents were routed to the users: instead of showing them documents in arrival order, we organized the queue so that similar-looking documents appeared one after another. Here is the data from the third day's run:
Almost a 120% gain in efficiency!
Part of this gain came from the machine learning algorithms re-learning, but the bigger gain came from repeatability. This is a fundamental opportunity that most data processing teams overlook. Beyond using AI to process documents end to end, there are many innovative ways to employ it to optimize your entire business process.
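The routing change behind this gain can be sketched in a few lines. Below is a minimal, hypothetical illustration (not Infrrd's actual routing logic): it assumes each document carries some layout fingerprint — here a made-up `(provider, template)` pair — and re-orders the queue so that documents sharing a fingerprint are served back to back, with the largest groups first so reviewers get the longest uninterrupted runs of a single layout.

```python
from collections import defaultdict

def route_for_repeatability(documents, layout_signature):
    # Bucket documents by their layout fingerprint so that
    # similar-looking documents end up adjacent in the queue.
    buckets = defaultdict(list)
    for doc in documents:
        buckets[layout_signature(doc)].append(doc)
    # Emit the largest groups first: long runs of one layout give
    # reviewers the most time to stay oriented to that layout.
    ordered = []
    for sig in sorted(buckets, key=lambda s: -len(buckets[s])):
        ordered.extend(buckets[sig])
    return ordered

# Toy data: provider + template stand in for a real layout fingerprint.
docs = [
    {"id": 1, "provider": "A", "template": "invoice-v1"},
    {"id": 2, "provider": "B", "template": "statement"},
    {"id": 3, "provider": "A", "template": "invoice-v1"},
    {"id": 4, "provider": "B", "template": "statement"},
    {"id": 5, "provider": "A", "template": "invoice-v1"},
]
queue = route_for_repeatability(docs, lambda d: (d["provider"], d["template"]))
# The three "invoice-v1" documents are now served consecutively,
# followed by the two "statement" documents.
```

In a real pipeline the signature function would come from the extraction system itself, for example a template or cluster ID assigned during classification, rather than from explicit metadata fields like these.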
When you begin looking at technology to optimize your data extraction process, take a step back and re-visualize your entire business process. Pay attention to how your team is structured today, which manual data entry tasks could be automated, and, most importantly, where there are new opportunities to introduce repeatability.