How to Improve Your Accounts Payable with AI-Powered OCR

Revolutionise accounts payable OCR: Unleash the power of AI for effortless receipt and invoice processing.

Published in: Receipt and Invoice OCR

For a long time, Accounts Payable (AP) workflows have relied on brute force: manpower, paper, and the opportunity cost of repetitive tasks. Needless to say, this is neither ideal nor scalable, and invoice processing is one of those areas where automation can deliver immediate ROI.

But wait! Digitisation alone is not digital transformation. Sure, you could mandate e-invoices (PDFs only, sent by email) and then use general-purpose OCR (Optical Character Recognition) to extract the text. You might cut your carbon footprint, but this approach doesn’t give you anything that your Accounts Payable team can actually use.

A bunch of unstructured text isn’t something you can just plug into the correct fields in your existing ERP systems and call it a day. You’d still need AP personnel to look through each digitised PDF and manually enter data, meaning you’ve ultimately gained nothing.

There are two ways you can use automation to overcome the limitations of general-purpose OCR. The conventional way is zonal/template-based, tightly coupled on-premise OCR; the next-gen way is an AI-powered, cloud-based solution.

Let’s take a look at each, and see why sticking to convention might have hidden costs that add up over time.

The Real Cost of Following Convention

As the name implies, Zonal or Template-based OCR is a way to selectively extract data from documents, looking at only specific areas – zones – instead of grabbing everything that vaguely resembles text. These algorithms have been programmed to recognise specific formatting and/or placement of images and text – templates – and can process invoices/receipts into usable data because they have been told where to look.

How does this work? Software engineers specifically tell the program which patterns of data to watch out for (regular expressions), or which coordinates map to what information (spatial bounding boxes). So the technology knows where to find, say, vendor information, or total price, for a given template.
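
To make this concrete, here is a minimal sketch of template-based extraction in Python. It assumes the OCR step has already produced words with page coordinates. The Word type, the TEMPLATE layout and the zone coordinates are all hypothetical, invented for illustration rather than taken from any particular OCR product.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the word on the page
    y: float  # top edge of the word on the page


# Hypothetical template for one vendor's layout.
# Each zone is (x_min, y_min, x_max, y_max) in page units.
TEMPLATE = {
    "vendor": {"zone": (0, 0, 300, 50), "pattern": r".+"},
    "total": {"zone": (400, 700, 600, 760), "pattern": r"\d+\.\d{2}"},
}


def words_in_zone(words, zone):
    """Keep only the words whose top-left corner falls inside the zone."""
    x_min, y_min, x_max, y_max = zone
    return [w for w in words if x_min <= w.x <= x_max and y_min <= w.y <= y_max]


def extract(words, template):
    """Apply each field's bounding box, then its regular expression."""
    result = {}
    for field, rule in template.items():
        text = " ".join(w.text for w in words_in_zone(words, rule["zone"]))
        match = re.search(rule["pattern"], text)
        result[field] = match.group(0) if match else None
    return result


# An invoice that follows the template exactly.
words = [Word("Acme", 10, 10), Word("Ltd", 60, 10),
         Word("Total:", 410, 720), Word("129.50", 480, 720)]
print(extract(words, TEMPLATE))

# The same invoice after the vendor moves the total higher up the page.
moved = [Word("Acme", 10, 10), Word("Ltd", 60, 10),
         Word("Total:", 410, 300), Word("129.50", 480, 300)]
print(extract(moved, TEMPLATE))
```

The second call returns no total at all, even though the amount is printed plainly on the page: the template only knows where to look, not what a total is.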

Sounds good! This should work well for bulk processing of documents which are guaranteed to be unchanged in their formatting/layout, and all you’d have to worry about is the next software update…right?

Not quite.

Scalability isn’t your only problem. There are more issues here than you might think.

1. Keeping Up With The Vendors

The obvious drawback is, of course, that bespoke solutions for every single type of invoice can quickly get out of hand as you grow beyond a mom-and-pop shop. Vendor rotations and progressively changing invoice layouts are a given with any organisation, in any industry. So your software engineers will have to build new templates every time you do business with someone new, or if one of them updates their invoice format.

What might not be so obvious (unless you’ve ever worked in IT at one of these companies) is what’s about to happen next.

Templates for roughly a third of all your invoices might never get made. Commissioning new templates is expensive, so management will try to cut costs wherever possible. Things will fall through the cracks, there will be communication gaps, and ultimately, to save time, most organisations will end up just saving invoices to a PDF and asking AP to enter them into the system manually, i.e., old-school data entry.

So much for Digital Transformation.

2. Computational Costs

Templates are expensive and time-consuming to create, and even if you have templates set up perfectly you’ll run into bottlenecks when actually processing them.

Think of the workflow triggered whenever an invoice comes in. Even if you have the template for this particular supplier, you’ll need to do a check on your end to see which template fits this invoice that just waltzed in.

That’s not one, but two storage and bandwidth overheads:

Creating and storing templates on your backend for each new type of invoice/supplier, and
Creating code and making network infrastructure considerations to run through all templates in storage and see if you have one that looks like the document before you.

If you’re using services for any of these that charge per storage unit/compute time, the bill grows with every template you add, and that’s before even considering the latency cost! Your AP workflow is held up while you play matchmaker with the current invoice, costing you money every second.
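
The matchmaking step can be sketched as a linear scan over every stored template. This is only an illustration under simple assumptions: templates are reduced to keyword strings, similarity is a plain Jaccard score over words, and the find_template helper and its threshold are hypothetical. Real systems use richer layout features, but the cost profile is similar: one comparison per stored template, per incoming document.

```python
def fingerprint(text):
    """Reduce a document to the set of lowercase words it contains."""
    return set(text.lower().split())


def similarity(a, b):
    """Jaccard similarity between two word sets, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_template(document_text, templates, threshold=0.5):
    """Compare the document against every stored template.

    Returns (best_template_name_or_None, number_of_comparisons).
    """
    doc = fingerprint(document_text)
    best_name, best_score, comparisons = None, 0.0, 0
    for name, keywords in templates.items():
        comparisons += 1
        score = similarity(doc, fingerprint(keywords))
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        best_name = None  # nothing close enough: falls back to manual entry
    return best_name, comparisons


templates = {
    "acme": "acme ltd invoice number total due",
    "globex": "globex corporation tax invoice amount payable",
}

print(find_template("Acme Ltd invoice number 1042 total due 129.50", templates))
print(find_template("Initech statement of charges", templates))
```

Every incoming invoice pays for a comparison against each template in storage, and an invoice from an unknown vendor pays the full cost only to end up in the manual queue anyway.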

3. Infrastructure & Maintenance Costs

These are the costs of buying and maintaining the on-prem hardware that hosts your OCR solution. Since you own this infrastructure yourself, you are responsible for acquiring, maintaining, and (most importantly) regularly updating it – with any personnel/expertise cost that entails.

This is a combination of OCR licensing fees, servers (whether bare metal or provisioned), error correction, engineers’ wages for debugging and regular updates, and any microservices you’re using behind the scenes to wire parts together.

Now that you know how convention might hold you back, let’s move past the 1970s with an AI/ML-powered solution.

4. Receipt & Invoice Duplicates

While some automation tools merely digitise documents without extracting usable data, TAGGUN's AI-powered solution not only accurately extracts structured data but also offers features designed to detect duplicates and even manage receipt fraud, adding an extra layer of protection against financial discrepancies.
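
Duplicate detection only becomes possible once the data is structured. The sketch below shows one simple way a caller could flag likely duplicates on their own side. It is not a description of how TAGGUN detects duplicates. The field names (merchant, date, total) and both helper functions are hypothetical.

```python
import hashlib


def receipt_fingerprint(receipt):
    """Hash the fields that should be identical on a resubmitted receipt."""
    # Normalise case and whitespace so "  ACME   ltd" equals "Acme Ltd".
    merchant = " ".join(receipt["merchant"].lower().split())
    key = f'{merchant}|{receipt["date"]}|{receipt["total"]:.2f}'
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def find_duplicates(receipts):
    """Return (first_index, duplicate_index) pairs for repeated receipts."""
    seen = {}
    duplicates = []
    for index, receipt in enumerate(receipts):
        fp = receipt_fingerprint(receipt)
        if fp in seen:
            duplicates.append((seen[fp], index))
        else:
            seen[fp] = index
    return duplicates


receipts = [
    {"merchant": "Acme Ltd", "date": "2024-03-01", "total": 129.5},
    {"merchant": "Globex", "date": "2024-03-02", "total": 40.0},
    {"merchant": "  ACME   ltd", "date": "2024-03-01", "total": 129.50},
]
print(find_duplicates(receipts))
```

An exact-match fingerprint like this catches straightforward resubmissions only. Edited or re-photographed receipts need fuzzier comparison, which is where a purpose-built service earns its keep.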

Out with the Old, In with the New

Our API is a real-time, scalable, easily integrable solution to automate data extraction and structured data generation from receipts and invoices. It is a REST API in the cloud, serving as a gateway to a powerful AI that learns each time it is used, and never stops getting better.

This is a next-gen approach to OCR altogether – taking what works from two generations of conventional approaches, and discarding what doesn’t.

It starts with your choice of Google Vision AI or Azure Cognitive Services – two state-of-the-art general-purpose OCR tools (First Gen) – and adds Regular Expressions (Second Gen/Template OCR) on top. Our API uses a cognitive machine learning solution, with fuzzy matching to detect logically related fields. All with pricing that is transparent, predictable, and a fraction of what building your own on-prem OCR solution would cost.
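
To give a feel for what fuzzy matching means here, the sketch below maps noisy OCR labels onto canonical field names using Python's standard difflib module. It only illustrates the general idea and is not TAGGUN's actual algorithm. The CANONICAL_FIELDS table, the match_label helper and the cutoff value are assumptions made for this example.

```python
from difflib import get_close_matches

# Hypothetical mapping from label text seen on documents to canonical fields.
CANONICAL_FIELDS = {
    "total": "total_amount",
    "amount due": "total_amount",
    "invoice number": "invoice_number",
    "tax": "tax_amount",
}


def match_label(raw_label, cutoff=0.7):
    """Map a possibly misread label to a canonical field, or None."""
    cleaned = raw_label.lower().strip(" :")
    hits = get_close_matches(cleaned, list(CANONICAL_FIELDS), n=1, cutoff=cutoff)
    return CANONICAL_FIELDS[hits[0]] if hits else None


print(match_label("Tota1:"))         # OCR read the letter l as the digit 1
print(match_label("Invoice Numbr"))  # a dropped character
print(match_label("Phone"))          # not a field we care about
```

A strict regular expression would reject "Tota1:" outright. A similarity score tolerates small misreads while still rejecting unrelated labels.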

How is TAGGUN Different from Template OCR?

It’s a completely new paradigm.

TAGGUN doesn’t need rule-based matching, programmers telling it where to look explicitly, or ad-hoc templates built for each and every vendor you do business with…because our API knows what receipts and invoices are, to a human level of accuracy. This AI has been trained on millions of invoices, and can scan fields in invoices/receipts like a real human would, knowing conceptually what the relevant parameters are for this use case, and where they would be – even if the receipt is in a previously unseen format or language, or even handwritten!

From your point of view, all you do is send images/scans of invoices to our secure API endpoint in the cloud, and get back parsed, structured data that you can directly feed into your existing AP workflows.

What Does TAGGUN Mean for Your AP Team?

1. Scalability is now a solved problem.

With TAGGUN, you can process invoices with over 90% accuracy in under 5 seconds even if the specific vendor has never been in your system before, as long as their documents fit some definition of ‘invoice’ that TAGGUN knows. No need for hundreds of bulky, bespoke templates.

2. You won’t need any major refactors of your existing ERP systems.

Integrating TAGGUN into your existing AP workflow is dead simple – literally just a matter of writing boilerplate code in your language of choice to make POST calls (with scans of your invoices/receipts in JPEG, PDF, PNG8, PNG24, GIF, or HEIF format) to our servers, and getting back structured data in JSON within seconds.
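
As a rough illustration of that boilerplate, here is a sketch in Python using only the standard library. The endpoint URL is a placeholder, and the apikey header name and the "file" form field are assumptions made for this example. Check TAGGUN's API documentation for the real endpoint, authentication header and request format before using anything like this.

```python
import json
import urllib.request
import uuid

# Placeholder endpoint, NOT a real TAGGUN URL.
API_URL = "https://api.example.com/receipt"


def build_request(file_name, file_bytes, content_type, api_key, url=API_URL):
    """Build a multipart/form-data POST request carrying one scanned document."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        file_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Accept": "application/json",
        "apikey": api_key,  # assumed header name
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def extract_receipt(request):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (makes a network call, so it is left commented out here):
# with open("invoice.png", "rb") as handle:
#     request = build_request("invoice.png", handle.read(), "image/png", "YOUR_API_KEY")
# data = extract_receipt(request)
```

In practice most teams would use an HTTP client library for the multipart encoding. The standard-library version is shown only to make clear that nothing beyond an ordinary POST request is involved.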

Because all of this data is in industry-standard formats end to end, you can feed actionable results directly into most ERP systems in use today.

3. You’ll see ROI in days, not months or years.

TAGGUN can be set up in a matter of hours. Using it is no different than using a conventional REST API – one that is intuitively designed, with plenty of documentation and live support. Unlike Template-based OCR, there is no extensive staff training, configuration, or recalibration needed. RESTful APIs have been around forever, after all – and that’s all you have to explain to your team; not the AI, machine learning, and fuzzy logic involved under the hood. That’s TAGGUN’s domain.

In fact, with the degree of automation that TAGGUN enables, you might not need most of your backend staff working on OCR at all, freeing them up to work on long-term feature additions to your platform, and its growth.

4. You can keep your existing infrastructure.

TAGGUN as a decoupled solution asks nothing of you in terms of hardware – unlike many OCR libraries which insist on powerful, vendor-specific hardware like NVIDIA GPUs – and is completely resilient to technology or business-requirement changes.

You are free to use whatever infrastructure/architecture best fits your business. Already have on-prem, bare metal ARM-based hardware? You can keep all of it. Want to go cloud-native and join the serverless revolution by having your SaaS be on the AWS ecosystem? Go nuts! All of it pairs perfectly fine with TAGGUN, allowing you to grow and scale as elastically as you want.

5. You have an easier route to Digital Transformation.

TAGGUN’s OCR and data extraction capabilities are entirely self-contained; not tied to your existing code or architectural decisions in any way. This loose coupling means you can freely add as much tooling around it as you want, and shape your organisation the way you want.

Here’s an example. Since using TAGGUN is literally just a matter of making an API call in code and nothing else, you could add logic in your own internal tools to, say, trigger an alert and flag an invoice in the database every time vendor tax identification is missing in the TAGGUN response. The possibilities here are endless, from freeing up your AP team by automatically routing requests to internal service teams, all the way to fraud detection.
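
A sketch of that kind of check might look like the following. The response field names (vendorTaxId, totalAmount) and the routing labels are hypothetical, chosen for illustration. The real response schema is defined in TAGGUN's API documentation.

```python
def review_flags(response):
    """Return the reasons, if any, that an extracted invoice needs a human look."""
    flags = []
    if not response.get("vendorTaxId"):      # hypothetical field name
        flags.append("missing vendor tax ID")
    if response.get("totalAmount") is None:  # hypothetical field name
        flags.append("missing total amount")
    return flags


def route(response):
    """Decide where the invoice goes next, based on the flags raised."""
    flags = review_flags(response)
    return ("manual_review", flags) if flags else ("auto_approve", flags)


complete = {"vendorTaxId": "GB123456789", "totalAmount": 129.50}
incomplete = {"vendorTaxId": "", "totalAmount": 129.50}

print(route(complete))
print(route(incomplete))
```

Because the check runs on plain parsed JSON, it lives entirely in your own codebase and can be extended with whatever business rules your AP team needs.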

TAGGUN adds value by abstracting away the traditional pain points of AP workflows, opening you up to possibilities you might not have considered for your organisation.

TAGGUN: AI-based Data Extraction that Pays for Itself

While designing any AP workflow, the two questions you should ask yourself are:

What is the human cost involved?
What is my upgrade path?

With TAGGUN, you certainly save on the human cost:

You no longer need as many humans-in-the-loop now that TAGGUN’s AI-powered engine makes frequent error correction and compatibility updates a thing of the past.
You save on hiring AI/ML specialists and infrastructure engineers of your own since TAGGUN lives entirely in the cloud as far as you’re concerned; you merely interface with it. Your existing personnel are freed up to work on other projects and create value in other ways instead of being bogged down in data capture banalities.
You can still use real humans in the loop for more accuracy – and rapid fine-tuning of the learning model with TAGGUN’s engineers – but remember, TAGGUN is a solution that learns and gets better. By virtue of AI, you’ll see your error rates drop progressively the more you use it, and so will your need for humans in the loop.

And as for the future? Further technological improvements will give you access to even faster processing, even more storage, and even better bandwidth. Cloud-based OCRs like TAGGUN are only going to get faster, better, and more accurate, while Zonal/Template-based OCRs are already seeing bottlenecks going from local enterprises to even regional concerns. Also, while general-purpose OCR tools such as Tesseract are useful for digitising various text documents, they often struggle with the specific and nuanced requirements of processing financial documents like receipts and invoices at scale.

With this powerful combination of technologies, cloud-powered AP is set to become the de facto standard going forward, and with TAGGUN, you can rest easy knowing you’re ready to face it.