Accuracy vs Confidence Score

By
Sweety Bajaj
Product Manager

Confidence Score vs Accuracy Score: Learn the Difference

Enterprises roll out an Intelligent Document Processing (IDP) system to either scale their operations or cut down on some costs. Irrespective of what they choose, they need to figure out what will help them achieve their goal. Should they aim for higher accuracy or a reliable confidence score? This is like The Matrix red pill / blue pill situation. Pick one and your world will change forever, or pick the other and go back to business as usual.

In our experience, dealing with some of the largest enterprises in financial services, the focus is usually on accuracy. Most companies think that if they get 95% accuracy, they are golden. Everyone believes that higher accuracy means lower costs, faster turnaround, and better scale for their business. While this is not entirely false, it does not address the complete picture.

Let’s take a closer look.

The problem with only using accuracy

On the surface, accuracy looks like a good measure. You would be hard-pressed to think how you could lose money if you get high accuracy. Let’s break this high-accuracy math down. Assume you have a data processing team that processes 10,000 data points or values every day. Without an IDP system, each and every one of these values is manually extracted by a person reading these documents. Let us also assume for the sake of simplification that they spend 1 minute to extract each of these values. The simple calculation shows:

10,000 X 1 = 10,000 minutes

Post IDP implementation, you end up with a system that extracts these values with 95% accuracy. This means that 500 values are wrong and 9,500 values are correct. Because the system cannot tell you which 500 are wrong, your data processing team still has to look at all 10,000 values to make sure they are correct: they review the 9,500 correct values and fix the 500 wrong ones. Assume they take 30 seconds to review a value and 1 minute to correct it. Here is the new math:

(9,500 X 0.5) + (500 X 1) = 5,250 minutes

Hmmm, suddenly the high accuracy gain does not look that high. Your effort drops from 10,000 minutes to 5,250 minutes, a reduction of a little under 50%.

Here, 95% accuracy translated to less than a 50% efficiency gain!
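
The arithmetic above can be reproduced in a few lines of Python. This is only a sketch of the article's simplified model: the function name and parameters are made up for illustration, and the timings (30 seconds to review, 1 minute to correct, 1 minute to extract manually) are the assumptions stated in the text. Evaluated as written, 9,500 reviews plus 500 corrections come to 5,250 minutes.

```python
def accuracy_only_effort(total_values, accuracy, review_min=0.5, correct_min=1.0):
    """Minutes needed when a person must look at every extracted value."""
    wrong = round(total_values * (1 - accuracy))  # values that need correcting
    right = total_values - wrong                  # values that only need a review
    return right * review_min + wrong * correct_min


# Fully manual baseline from the text: 10,000 values at 1 minute each
manual_minutes = 10_000 * 1.0

# IDP system with 95% accuracy and no usable confidence score
idp_minutes = accuracy_only_effort(10_000, 0.95)
saving = 1 - idp_minutes / manual_minutes

print(f"{idp_minutes:,.0f} minutes, {saving:.1%} less effort")
```

Changing the accuracy argument shows how slowly the saving grows: as long as every value needs a human look, the review time sets a floor on the effort.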

Introducing the Confidence Score

IDP systems try to get around this challenge by offering you a confidence score: a machine learning probability score that tells you how confident the underlying algorithm is that it has extracted the correct value. The trouble with the confidence score is that if it is not 100%, you cannot reliably decide whether you need to look at the extracted data or not. Sometimes the algorithm might give you a confidence score of 65% and be completely right. At other times, it might give you a confidence score of 95% and yet extract the wrong value.
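
To make the problem concrete, here is a minimal sketch of threshold-based routing on confidence scores. Everything in it is an assumption for illustration: the Extraction class, the field names, the 0.90 threshold, and the three made-up sample values (including a correct value scored 0.65 and a wrong value scored 0.95, as in the paragraph above). It is not any vendor's actual scoring logic.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    name: str          # hypothetical field name
    confidence: float  # score reported by the extraction model, 0.0 to 1.0
    is_correct: bool   # ground truth, known only after a human check


def split_by_threshold(extractions, threshold):
    """Auto-accept values at or above the threshold, send the rest to review."""
    accepted = [e for e in extractions if e.confidence >= threshold]
    review = [e for e in extractions if e.confidence < threshold]
    return accepted, review


def wrong_but_accepted(accepted):
    """Errors that would slip through without anyone looking at them."""
    return [e for e in accepted if not e.is_correct]


# Made-up sample data, not real extraction results
sample = [
    Extraction("invoice_number", 0.65, True),
    Extraction("total_amount", 0.95, False),
    Extraction("due_date", 0.98, True),
]

accepted, review = split_by_threshold(sample, threshold=0.90)
slipped = wrong_but_accepted(accepted)

print("auto-accepted:", [e.name for e in accepted])
print("sent to review:", [e.name for e in review])
print("wrong but accepted:", [e.name for e in slipped])
```

With an unreliable score, the wrong value is accepted without review while a correct one is sent to a person anyway, so the threshold neither protects quality nor saves effort.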

Did you know? A confidence interval is a different thing from a confidence score. It is a range of values that, at a stated confidence level, is expected to contain the true value for an entire population. Confidence intervals are usually reported in the context of a margin of error.

But if you have a confidence score that is reliable, and by reliable I mean ‘take-it-to-the-bank’ reliable, the system can tell you when you do not need to look at an extracted value and can use it as is. Let’s revisit that math. This new IDP system is only 85% accurate, but it gives you a reliable confidence score for 70% of the values it extracts correctly. That means no one needs to look at those values. Here is the new math:

Values extracted correctly: 10,000 X 85% = 8,500

Reliable values: 8,500 X 70% = 5,950

Values to review: 8,500 - 5,950 = 2,550

Time spent: (2,550 X 0.5) + (1,500 X 1) = 2,775 minutes

This translates into an efficiency gain of about 72% with a system whose accuracy is 10 percentage points lower!
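
The same kind of calculation can be sketched for the confidence-score scenario. The function and parameter names below are hypothetical, and the model is the simplified one from the text: values with a reliable score are used as is, the remaining correct values are reviewed at 30 seconds each, and wrong values are corrected at 1 minute each. Starting from 8,500 correct values with 5,950 of them trusted, 8,500 - 5,950 = 2,550 values are left to review and 1,500 to correct, which comes to 2,775 minutes.

```python
def effort_with_confidence(total_values, accuracy, trusted_share,
                           review_min=0.5, correct_min=1.0):
    """Return (trusted, to_review, to_correct, minutes).

    trusted_share is the fraction of correctly extracted values that the
    confidence score marks as safe to use without a human look.
    """
    correct = round(total_values * accuracy)
    trusted = round(correct * trusted_share)  # nobody looks at these
    to_review = correct - trusted             # correct, but not marked as safe
    to_correct = total_values - correct       # wrong values that must be fixed
    minutes = to_review * review_min + to_correct * correct_min
    return trusted, to_review, to_correct, minutes


trusted, to_review, to_correct, minutes = effort_with_confidence(10_000, 0.85, 0.70)
gain = 1 - minutes / 10_000  # against the 10,000-minute manual baseline

print(f"trusted={trusted:,} review={to_review:,} correct={to_correct:,}")
print(f"{minutes:,.0f} minutes, efficiency gain {gain:.2%}")
```

Setting trusted_share to 0 reduces this to the accuracy-only case, which makes it easy to compare systems with different accuracy and different shares of trustworthy scores.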

There are two challenges with confidence scores

The first one is about awareness. Most customers place undue importance on the accuracy score and do not realize that unless it is 100%, they will need to look at quite a bit of data. This is a fundamental misunderstanding of how probability works. When the accuracy of a field is 90%, there is a 10% chance that any given value of that field is incorrect. Unfortunately, there is no way to find out whether a value is correct other than looking at it. That is why a high accuracy score of 90% does not necessarily mean that you will save 90% of your effort unless the algorithm can tell you which 90 values out of 100 are correct.

The second challenge is that most vendors cannot give a reliable, ‘take-it-to-the-bank’ confidence score.

The Infrrd Take

Why do most IDP solutions miss out on the reliability factor when it comes to confidence scores?

The primary reason is usually the level of expertise and cost involved in going deeper, compared to just settling for a character-level language model or word-level scoring.


This is a fairly complex problem and our research team has spent a lot of time on it. It is extremely difficult for most vendors to give this confidence and perhaps that is why they try to steer the conversation to accuracy more than the confidence level. Advancements in technology, and machine learning models in general, allow us to dish out confidence scores with (drum roll please) confidence.

Our patent-pending confidence score algorithm uses novel techniques to look at test data and multiple signals to give you a confidence score so good that you do not have to look at the data for verification. This translates to real cost savings and, more importantly, more time for your data processing teams. From the training data to correct predictions, from the precision-recall curve to true values, with Infrrd you can rest assured of surpassing the confidence score threshold rather easily. Besides, it saves you from the hassle of dealing with incorrect predictions, false positives as well as false negatives from the test dataset.

Whatever your use case may be, make it a point to invest in an IDP solution based on a machine learning model that digs deeper into the data and offers you reliability - the one that offers a ‘take-it-to-the-bank’ confidence score.

Frequently asked questions

What does your pricing model look like?

We price based on the annual volume of pages and the complexity of the document type. We can get you preliminary pricing once we have outlined a solution. Let's do this.

To know more, book a 15-min session with an IDP expert

How can I try Infrrd before I commit to a full deployment?

The first step is to schedule a guided demo where you get to jump into the thick of it. After you explore our solution, you can try a proof of concept. When you're ready, you can deploy the system to one use case. Then more use cases. Then across your enterprise.

How does your system integrate with others in my enterprise?

We play nice. Our solutions are API-based. Your documents are fed into the solution using APIs, and extracted data is sent out through APIs. We use REST APIs.

Does your solution run in the cloud or on premise?

Our solution is cloud-native but is also designed for on-premise deployments. It is your choice how you want to deploy it.

Does Infrrd run on a mobile or desktop device?

Glad you asked. Our data extraction process runs on servers. We have found that performance and accuracy decline when running on a desktop or mobile device. (Remember, Infrrd is running a powerful AI stack.)

Does your system work out of the box or does it require training?

Common documents and use cases work out of the box.  The cool thing is your solution will improve as the system learns from your documents upfront and over time.

How does your solution handle corrections?

Did you know no system is 100% accurate all the time?  When extraction errors occur you want to correct them.  We provide a simple UI that your business analyst will use to make corrections.

Does your solution work with handwriting?

Our solution excels at data extraction from handwriting.  We've got proprietary methods and techniques that do the trick.  It's pretty cool.  See for yourself.
