Receipt Transcription Engine
Strictly speaking, Taggun’s receipt transcription engine is more of a Natural Language Processing (NLP) engine than an OCR engine. Taggun uses OCR providers such as Google Vision to perform the image-to-text OCR step. This lets Taggun rely on the speed and accuracy of an external OCR provider to produce raw text from an image. See the example below:
Raw Text Output from OCR Provider
The raw text output is great, but on its own it is of little use for software integration, because software cannot consume raw text as structured data. Taggun is laser-focused on building the best receipt transcription engine to process this raw text and produce machine-consumable output in JSON format. This allows software in the expense management and digital loyalty space to integrate easily with Taggun. See the example below:
JSON Output from Taggun
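To make the raw-text-to-JSON step more concrete, here is a minimal Python sketch of the general idea. It is not Taggun’s actual engine or output schema: the sample text, the `parse_receipt` function, the field names and the regular expressions are all assumptions made up for illustration, and a real engine has to cope with far messier input.

```python
import json
import re
from typing import Optional

# Hypothetical raw OCR text; real provider output is usually much noisier.
RAW_TEXT = """SUNNY CAFE
12 High Street
2018-03-14 12:45
Flat White      4.50
Muffin          3.80
TOTAL           8.30
"""


def parse_receipt(raw_text: str) -> dict:
    """Extract a few fields from raw receipt text using simple heuristics."""
    lines = [line.strip() for line in raw_text.splitlines() if line.strip()]

    # Assumption: the merchant name is the first non-empty line.
    merchant: Optional[str] = lines[0] if lines else None

    # Assumption: the date is printed in ISO format (YYYY-MM-DD).
    date_match = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", raw_text)

    # Assumption: the total sits on its own line starting with "TOTAL".
    total_match = re.search(
        r"^\s*TOTAL\s+(\d+\.\d{2})\s*$",
        raw_text,
        re.IGNORECASE | re.MULTILINE,
    )

    return {
        "merchantName": merchant,
        "date": date_match.group(1) if date_match else None,
        "totalAmount": float(total_match.group(1)) if total_match else None,
    }


if __name__ == "__main__":
    print(json.dumps(parse_receipt(RAW_TEXT), indent=2))
```

The point of the sketch is the shape of the problem: text goes in, named fields come out, and any field that cannot be found is returned as `null` rather than guessed.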
Great News! +1 OCR Provider
In the past, I received feedback that it is a business continuity risk for Taggun’s receipt scanning API to rely solely on a single provider: Google Vision. Great news! Taggun has now successfully integrated with Microsoft Cognitive Computer Vision, so we no longer depend on a single OCR provider. Our customers can now choose between Google Vision and Microsoft Cognitive as their OCR provider.
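One way to picture this is a small provider registry in which each OCR provider is just a function from image bytes to raw text. The Python sketch below is an assumption about how such a setup could look, not a description of Taggun’s code: the provider functions are placeholders that do not call any real API, and the automatic fallback to the other provider is my own addition for illustration.

```python
from typing import Callable, Dict, Tuple

# An OCR provider is modelled as a function from image bytes to raw text.
OcrProvider = Callable[[bytes], str]


def google_vision_ocr(image: bytes) -> str:
    # Placeholder: a real implementation would call the Google Vision API here.
    return "raw text from google"


def microsoft_cognitive_ocr(image: bytes) -> str:
    # Placeholder: a real implementation would call Microsoft's API here.
    return "raw text from microsoft"


# Hypothetical registry keyed by a short provider name.
PROVIDERS: Dict[str, OcrProvider] = {
    "google": google_vision_ocr,
    "microsoft": microsoft_cognitive_ocr,
}


def extract_raw_text(image: bytes, preferred: str = "google") -> Tuple[str, str]:
    """Run OCR with the preferred provider, falling back to the others.

    Returns a (provider_name, raw_text) pair so callers know who answered.
    """
    if preferred not in PROVIDERS:
        raise ValueError(f"unknown OCR provider: {preferred!r}")

    # Try the preferred provider first, then every other registered one.
    order = [preferred] + [name for name in PROVIDERS if name != preferred]
    errors = []
    for name in order:
        try:
            return name, PROVIDERS[name](image)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all OCR providers failed: " + "; ".join(errors))
```

Because the rest of the pipeline only ever sees raw text, swapping or adding a provider does not touch the transcription logic.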
We also compared the accuracy of these two OCR providers by scanning 285 receipts. Google Vision produced a slightly better result at 83.45%, and Microsoft Cognitive was not far behind at 81.42%. This is a great testament to the robustness of Taggun’s receipt transcription engine, which can take the raw text from different OCR providers and still produce a high-quality result.
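For readers wondering what a comparison like this can look like in code, here is a small Python sketch of one possible metric: the share of expected fields that were extracted exactly. This is only an assumption for illustration. It is not necessarily how the figures above were calculated, and the `field_accuracy` function and the tiny sample data are made up.

```python
from typing import Dict, List


def field_accuracy(
    predictions: List[Dict[str, object]],
    ground_truth: List[Dict[str, object]],
) -> float:
    """Return the fraction of ground-truth fields that were predicted exactly."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")

    correct = 0
    total = 0
    for predicted, expected in zip(predictions, ground_truth):
        for field, value in expected.items():
            total += 1
            if predicted.get(field) == value:
                correct += 1
    return correct / total if total else 0.0


# Made-up data: two receipts, two fields each, one field extracted wrongly.
truth = [
    {"totalAmount": 8.30, "date": "2018-03-14"},
    {"totalAmount": 12.00, "date": "2018-03-15"},
]
predicted = [
    {"totalAmount": 8.30, "date": "2018-03-14"},
    {"totalAmount": 12.00, "date": "2018-03-16"},
]

print(f"{field_accuracy(predicted, truth):.2%}")  # prints 75.00%
```

Running the same labelled receipts through each provider and comparing the resulting scores gives a like-for-like view of how the OCR step affects the final output.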
Data Sovereignty Law
An additional benefit of integrating with Microsoft Cognitive Services is that it lets us select the location where the API instance is hosted. This allows Taggun to offer a solution that complies with data sovereignty laws, because the data does not have to be sent to or stored outside the country.
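As a rough sketch of what location selection could look like, the Python snippet below maps a customer’s country to a regional endpoint. It assumes the regional URL pattern `https://<region>.api.cognitive.microsoft.com` used by Microsoft’s hosted Cognitive Services; the country-to-region table and the `ocr_endpoint_for` function are hypothetical and are not Taggun’s actual configuration.

```python
# Hypothetical mapping from ISO country code to an Azure region.
# Which region satisfies a given country's rules is a legal question,
# so treat these entries as placeholders.
REGION_BY_COUNTRY = {
    "AU": "australiaeast",
    "GB": "uksouth",
    "US": "eastus",
}


def ocr_endpoint_for(country_code: str) -> str:
    """Return the regional OCR endpoint for a country, or fail loudly."""
    region = REGION_BY_COUNTRY.get(country_code.upper())
    if region is None:
        # Fail closed: never silently route data to another country.
        raise ValueError(f"no in-country OCR region configured for {country_code!r}")
    # Assumption: endpoints follow the regional Cognitive Services URL pattern.
    return f"https://{region}.api.cognitive.microsoft.com"


print(ocr_endpoint_for("au"))
```

The important design choice is the failure mode: if no in-country region is configured, the request is rejected instead of being routed to a default location.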