Story of Akshay and the 3-year-long path to a patent
by Anusha Venkatesh, on 30 July 2021
Meet Akshay Uppal, one of Infrrd’s brilliant machine learning engineers. He was just granted his very first patent for developing a system to extract data from unconventional images: not just any images, but stamps and other complex images found on documents.
His invention has unlocked new pathways for Intelligent Document Processing. We had a chance to interview him, and here’s how it went.
Interviewer: Tell us about your journey to the patent.
Akshay: This was one of the very first projects I picked up when I joined Infrrd 3 years ago. At the time, the team was working on a requirement to extract information from documents such as invoices and receipts, and as a subset of that use case, we had to extract important information from the stamps present on these documents. I was handed this subset of the problem, which soon turned into solving the bottlenecks of OCR itself in capturing data from images.
Interviewer: What were the bottlenecks?
Akshay: So here’s what happens when extracting data from images with stamps.
Existing OCR solutions cannot robustly extract text from images that suffer from any of these three bottlenecks:
- Random Orientation of the Images: Logos, stamps, and even scanned documents do not always have a fixed orientation. Since every image could be oriented differently, the OCR cannot account for this randomness and is unable to produce accurate results. Traditional image processing techniques, and some high-end OCR solutions, can correct the orientation to some extent in some cases, but this is not a robust method because documents vary a lot, especially in the case of logos, stamps, etc.
- Occlusion in Text: OCR solutions are not yet capable of fully isolating the background from the foreground. This means that if some background text overlaps the target text, the OCR cannot produce proper results.
- Unconventional Text: OCR solutions mostly expect horizontal or near-horizontal text, or at least uniformly oriented text. With stamps, logos, and banners this condition does not always hold, and the existing OCR solutions cannot handle such cases.
Pretty soon we understood that there were no existing solutions to these OCR bottlenecks and that we had to develop one.
Interviewer: And how did that go?
Akshay: We used deep learning models to separate the stamp out and predict its orientation. Based on that, the system extracts numbers and letters from the stamp.
This not only helped us separate the stamps from the original text underneath, but also let us successfully extract data from the stamps, including locations, pincodes, etc.
We further extended the solution to help extract data from other unconventional images like logos and more.
Today, this invention has created an entire solution suite for Infrrd’s proprietary platform.
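To make the orientation step more concrete, here is a minimal Python sketch of what "correcting for a predicted orientation" can look like. It is an illustration only, not the patented method: the image is a toy grid of numbers, the function names `rotate_cw` and `undo_rotation` are hypothetical, the predicted angle is passed in by hand where a deep learning model would supply it, and only multiples of 90 degrees are handled.

```python
from typing import List

Grid = List[List[int]]  # toy stand-in for an image: rows of pixel values


def rotate_cw(grid: Grid) -> Grid:
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def undo_rotation(grid: Grid, predicted_cw_degrees: int) -> Grid:
    """Turn a grid back upright, given its predicted clockwise rotation.

    Assumption: the angle comes from some orientation predictor
    (hypothetical here) and is a multiple of 90 degrees.
    """
    if predicted_cw_degrees % 90 != 0:
        raise ValueError("this sketch only handles multiples of 90 degrees")
    # Undoing k clockwise quarter turns equals applying (4 - k) more of them.
    turns = (4 - (predicted_cw_degrees // 90) % 4) % 4
    for _ in range(turns):
        grid = rotate_cw(grid)
    return grid


upright = [[1, 2], [3, 4]]
rotated = rotate_cw(upright)            # simulate a stamp applied sideways
restored = undo_rotation(rotated, 90)   # 90 plays the role of the prediction
```

In this sketch `restored` matches `upright` again. A real system would have to handle arbitrary angles and curved text, and would first need to isolate the stamp from the text underneath it, which is the harder part Akshay describes.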
Interviewer: Of all the places, why Infrrd?
Akshay: The work that I do at Infrrd is pretty challenging. There is a lot of research that goes into deep learning and machine learning. There is a significant gap between the industry and said research. At Infrrd, we’re trying to bridge this gap. The work I do here is intellectually stimulating and keeps me on my toes.
Another thing about Infrrd is that we are a close-knit community. We are continuously updating our platform. We all have the same big picture in mind that we are working towards together. That sense of inclusion and seeing your work make an impact is encouraging. That’s what is keeping me at Infrrd.
In the end:
That was Akshay Uppal in his own words. Akshay, who joined us 3 years ago, is now an integral part of the team. His passion to innovate and enthusiasm to execute are infectious. Kudos, Akshay! You made your Infrrd family proud.