Confidence Score vs Accuracy Score: Learn the Difference
Enterprises roll out an Intelligent Document Processing (IDP) system to either scale their operations or cut down on some costs. Irrespective of what they choose, they need to figure out what will help them achieve their goal. Should they aim for higher accuracy or a reliable confidence score? This is like The Matrix red pill / blue pill situation. Pick one and your world will change forever, or pick the other and go back to business as usual.
In our experience, dealing with some of the largest enterprises in financial services, the focus is usually on accuracy. Most companies think that if they get 95% accuracy, they are golden. Everyone believes that higher accuracy means lower costs, faster turnaround, and better scale for their business. While this is not entirely false, it does not address the complete picture.
Let’s take a closer look.
The problem with only using accuracy
On the surface, accuracy looks like a good measure. You would be hard-pressed to think how you could lose money if you get high accuracy. Let’s break this high-accuracy math down. Assume you have a data processing team that processes 10,000 data points or values every day. Without an IDP system, each and every one of these values is manually extracted by a person reading these documents. Let us also assume for the sake of simplification that they spend 1 minute to extract each of these values. The simple calculation shows:
10,000 X 1 = 10,000 minutes
Post IDP implementation, you end up with a system that extracts these values with 95% accuracy. This means that 500 values are wrong and 9,500 values are correct. The catch is that the system does not tell you which 500 are wrong, so your data processing team still has to look at all 10,000 values: they review the 9,500 correct ones and correct the 500 wrong ones. Assume they take 30 seconds to review a value and 1 minute to correct it. Here is the new math:
(9,500 X 0.5) + (500 X 1) = 5,250 minutes
Hmmm, suddenly the high accuracy gain does not look that high. You are down from 10,000 minutes to 5,250 minutes, which is still a little over half of the original effort.
Here, 95% accuracy translated to an efficiency gain of just under 50%!
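The arithmetic above can be checked with a few lines of Python. This is only a sketch of the worked example: the function names are made up for illustration, and the 30-second review time and 1-minute correction time are the assumptions stated in the text.

```python
def manual_minutes(total_values, minutes_per_value=1.0):
    """Effort when every value is extracted by hand."""
    return total_values * minutes_per_value


def review_minutes(total_values, accuracy, review_min=0.5, correct_min=1.0):
    """Effort when every extracted value must still be looked at.

    Correct values cost a quick review; wrong values cost a full correction.
    """
    wrong = round(total_values * (1 - accuracy))
    correct = total_values - wrong
    return correct * review_min + wrong * correct_min


baseline = manual_minutes(10_000)
with_idp = review_minutes(10_000, accuracy=0.95)
saving = 1 - with_idp / baseline
print(f"baseline: {baseline:,.0f} min, with IDP: {with_idp:,.0f} min, saving: {saving:.1%}")
```

Changing the accuracy argument shows how little the saving moves as long as every value still has to be reviewed.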
Introducing the Confidence Score
IDP systems get around this challenge by offering you a confidence score: a machine learning probability score that tells you how confident the underlying algorithm is that it has extracted the correct value. The trouble with the confidence score is that if it is not 100%, you cannot reliably decide whether you need to look at the extracted data or not. Sometimes the algorithm might give you a confidence score of 65% and be completely right. At other times, it might give you a confidence score of 95% and yet extract the wrong value.
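One way to see whether a confidence score can be trusted is to compare it with what actually happened on values a human has already checked. The sketch below groups checked values into confidence ranges and reports the observed accuracy in each range. The function name, the bin count, and the sample data are all made up for illustration; this is not how any particular IDP product computes or validates its scores.

```python
from collections import defaultdict


def calibration_table(predictions, n_bins=5):
    """Group (confidence, was_correct) pairs into equal-width confidence bins.

    Returns a list of (bin_low, bin_high, count, observed_accuracy) rows.
    """
    bins = defaultdict(list)
    for confidence, was_correct in predictions:
        # Clamp so a confidence of exactly 1.0 lands in the top bin.
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append(was_correct)
    rows = []
    for index in sorted(bins):
        outcomes = bins[index]
        rows.append((index / n_bins, (index + 1) / n_bins,
                     len(outcomes), sum(outcomes) / len(outcomes)))
    return rows


# Made-up sample: (confidence score, did a human confirm the value?)
sample = [(0.65, True), (0.68, True), (0.72, False), (0.95, False),
          (0.97, True), (0.99, True), (0.91, True), (0.55, False)]
for low, high, count, accuracy in calibration_table(sample):
    print(f"{low:.1f}-{high:.1f}: n={count}, observed accuracy={accuracy:.0%}")
```

If the observed accuracy in the top range is well below the scores reported there, the score cannot be used to skip review.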
Did you know? A confidence interval is not the same thing as a confidence score. It is a range of values, estimated from a sample, that is expected to contain the true value for the entire population at a stated confidence level. Confidence intervals are usually reported together with a margin of error.
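As an illustration of a confidence interval and its margin of error, the sketch below estimates the accuracy of an extraction system from a hand-checked sample. It uses the standard normal approximation for a proportion; the function name and the sample of 200 values with 190 correct are assumptions made up for this example.

```python
import math


def accuracy_confidence_interval(correct, sample_size, z=1.96):
    """Normal-approximation interval for an accuracy measured on a sample.

    z=1.96 corresponds to a 95% confidence level.
    """
    p = correct / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return max(0.0, p - margin), min(1.0, p + margin), margin


low, high, margin = accuracy_confidence_interval(correct=190, sample_size=200)
print(f"accuracy 95.0% +/- {margin:.1%} (from {low:.1%} to {high:.1%})")
```

A larger checked sample narrows the interval, which is why an accuracy figure quoted without a sample size says little.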
But if you have a confidence score that is reliable - by reliable I mean ‘take-it-to-the-bank’ reliable - the system can then tell you when you do not need to look at an extracted value and can use it as is. Let’s revisit that math again. This new IDP system is 85% accurate but gives you a reliable confidence score for 70% of the values it extracts correctly. That means no one needs to look at that 70%. Here is the new math:
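Values extracted correctly: 10,000 X 85% = 8,500
Reliable values: 8,500 X 70% = 5,950
Time spent: (2,550 X 0.5) + (1,500 X 1) = 2,775 minutes
The remaining 2,550 correct values still get a 30-second review, and the 1,500 wrong values get corrected. This translates into an efficiency gain of about 72% with a system whose accuracy is 10 percentage points lower!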
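The same calculation can be sketched in Python. It assumes, for illustration, that every reliably scored value is correct, so the people reviewing see only the correct values that were not reliably scored (8,500 - 5,950 = 2,550) plus the 1,500 wrong ones. The function name and parameters are made up for this example.

```python
def effort_with_reliable_confidence(total_values, accuracy, reliable_share,
                                    review_min=0.5, correct_min=1.0):
    """Minutes of human effort when reliably scored values skip review.

    Assumes (for illustration) that every reliably scored value is correct,
    so all wrong values land in the set that humans still look at.
    """
    correct = round(total_values * accuracy)
    wrong = total_values - correct
    trusted = round(correct * reliable_share)   # used as is, no review
    reviewed_correct = correct - trusted        # quick 30-second check
    return reviewed_correct * review_min + wrong * correct_min, trusted


minutes, trusted = effort_with_reliable_confidence(10_000, 0.85, 0.70)
print(f"trusted: {trusted:,}, effort: {minutes:,.0f} min, "
      f"saving vs manual: {1 - minutes / 10_000:.1%}")
```

Raising reliable_share reduces effort much faster than raising accuracy does, which is the point of the comparison.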
There are two challenges with confidence scores
The first one is about awareness. Most customers put undue importance on the accuracy score and do not realize that unless it is 100%, they will need to look at quite a bit of data. This is a fundamental misunderstanding of how probability works. When the accuracy of a field is 90%, there is a 10% chance that any given value of that field is completely wrong. Unfortunately, there is no way to find out whether a value is correct other than looking at it. That is why a high accuracy score of 90% does not necessarily mean that you will save 90% of your effort, unless the algorithm can tell you which 90 documents out of the 100 are correct.
The second challenge is that most vendors cannot give a reliable, ‘take-it-to-the-bank’ confidence score.
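To make the idea of telling correct values apart concrete, here is a minimal sketch of how a confidence threshold could be chosen from values that humans have already checked, and then used to decide which new values skip review. It is a generic illustration, not Infrrd's algorithm; the function names, the target precision, and the sample data are all hypothetical.

```python
def pick_threshold(validated, target_precision=0.99):
    """Lowest confidence cut-off whose auto-accepted set meets the target.

    `validated` is a list of (confidence, was_correct) pairs that humans
    have already checked. Returns None if no cut-off is good enough.
    """
    best = None
    for cutoff in sorted({confidence for confidence, _ in validated}, reverse=True):
        accepted = [ok for confidence, ok in validated if confidence >= cutoff]
        precision = sum(accepted) / len(accepted)
        if precision >= target_precision:
            best = cutoff  # remember the lowest cut-off that meets the target
    return best


def route(confidence, threshold):
    """Send a value straight through or to a human reviewer."""
    if threshold is not None and confidence >= threshold:
        return "auto-accept"
    return "review"


# Made-up validation sample: (confidence score, did a human confirm it?)
validated = [(0.99, True), (0.97, True), (0.93, True), (0.90, False),
             (0.85, True), (0.70, True), (0.60, False)]
threshold = pick_threshold(validated, target_precision=1.0)
print(threshold, route(0.95, threshold), route(0.88, threshold))
```

In this sample the correct values scored 0.85 and 0.70 still go to review, because a wrong value scored 0.90 sits above them. The savings depend on how well the score separates right from wrong, not on accuracy alone.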
The Infrrd Take
Why do most IDP solutions miss out on the reliability factor when it comes to confidence scores?
The primary reason is usually the level of expertise and cost involved in going deeper as compared to just settling with a character-level language model or word-level scoring.
This is a fairly complex problem and our research team has spent a lot of time on it. It is extremely difficult for most vendors to give this confidence and perhaps that is why they try to steer the conversation to accuracy more than the confidence level. Advancements in technology, and machine learning models in general, allow us to dish out confidence scores with (drum roll please) confidence.
Our patent-pending confidence score algorithm uses novel techniques to look at test data and multiple signals to give you a confidence score so good that you do not have to look at the data for verification. This translates to real cost savings and, more importantly, more time for your data processing teams. With Infrrd, you can rest assured that extracted values will clear your confidence score threshold rather easily. It also saves you the hassle of dealing with incorrect predictions, false positives as well as false negatives, by hand.
Whatever your use case may be, make it a point to invest in an IDP solution based on a machine learning model that digs deeper into the data and offers you reliability - the one that offers a ‘take-it-to-the-bank’ confidence score.
Frequently asked questions
What does your pricing model look like?
We price based on the annual volume of pages and the complexity of the document type. We can get you preliminary pricing once we have outlined a solution.
How can I try Infrrd before I commit to a full deployment?
Sure. The first step is to schedule a guided demo where you get to jump into the thick of it. After you explore our solution you can try a proof of concept. When you're ready, you can deploy the system to one use case. Then more use cases. Then across your enterprise.
Glad you asked. Our data extraction process runs on servers. We have found that performance and accuracy decline when running on a desktop or mobile device. (Remember, Infrrd is running a powerful AI stack.)
In a fast-paced world filled with never-ending rivers of documents and data, organizations continuously need smarter ways to work. Teams need flexible solutions that enable them to work faster while delivering higher levels of reliable accuracy than ever before. At Infrrd, we empower teams with Intelligent Document Processing Solutions for Intelligent Work™.