Taggun’s Architecture Overview

How our OCR API architecture extends beyond just a standard receipt OCR service.

The Taggun application is hosted on AWS and is currently online in the California and Sydney regions. Each region's Virtual Private Cloud (VPC) has a unique public IP address, and internet traffic is routed to the region with the lowest latency. Within a region, the workload is further distributed across multiple instances of the Taggun application, which run as Docker containers inside EC2 virtual machines.

Each virtual machine instance performs the same function, and together they act as an active-active cluster. In the event of a region-wide outage, traffic fails over to the other region(s), which covers the majority of disaster recovery scenarios.

The System Architecture Design Diagram shows the key components of the system.

SSL + HTTPS Traffic

All internet traffic is encrypted over HTTPS. The SSL certificate is stored on the reverse proxy server, which also handles SSL termination.

Amazon Cloud DNS

Amazon Cloud DNS (Amazon Route 53) is a highly available and scalable cloud Domain Name System (DNS) web service. Taggun relies on Amazon Route 53 to dynamically route traffic to the Virtual Private Cloud (VPC) with the lowest latency. Through integration with Amazon CloudWatch, Taggun automatically fails over at the DNS level, routing your traffic from an unhealthy resource to a healthy one after 3 to 5 minutes of consecutive failures.
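The routing behaviour described above can be illustrated with a small sketch. This is not how Route 53 is implemented or configured; it only models the idea of picking the lowest-latency region among those that are still passing health checks. The `Region` class, the lowercase region names, the latency figures, and the `FAILURE_THRESHOLD` value are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    name: str
    latency_ms: float              # latency measured from the caller to this region
    consecutive_failures: int = 0  # failed health checks in a row


# Hypothetical threshold: failed checks in a row before a region is
# treated as unhealthy. The real value depends on the health check setup.
FAILURE_THRESHOLD = 3


def is_healthy(region: Region) -> bool:
    return region.consecutive_failures < FAILURE_THRESHOLD


def record_health_check(region: Region, ok: bool) -> None:
    # A successful check resets the counter; a failed one increments it.
    region.consecutive_failures = 0 if ok else region.consecutive_failures + 1


def pick_region(regions: List[Region]) -> Region:
    # Route to the lowest-latency region that is still healthy.
    healthy = [r for r in regions if is_healthy(r)]
    if not healthy:
        raise RuntimeError("no healthy region available")
    return min(healthy, key=lambda r: r.latency_ms)
```

With two regions, a caller close to California is routed there until enough consecutive health checks fail, at which point `pick_region` returns Sydney. A later successful check brings California back into rotation.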

Virtual Private Cloud (VPC)

A VPC is a logically isolated virtual network that contains the virtual machines and other resources. Each VPC has a virtual firewall configured to control inbound and outbound traffic for its virtual machines.

EC2 Virtual Machines

The virtual machines are m4.large instances, each with 2 vCPUs, 8 GB of memory, and 80 GB of SSD storage. The host OS is Ubuntu Server 16.04.2 LTS. The Taggun architecture can scale up by using larger EC2 instances and scale out by adding EC2 instances to a cluster.

Reverse Proxy

Each virtual machine runs an NGINX reverse proxy. All external HTTPS/HTTP traffic must pass through the reverse proxy to reach the Docker network.

Kong API Gateway

Kong API Gateway sits in front of all API requests. It acts as the API authentication layer and as a rate limiter that helps mitigate DDoS attacks.
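To show what rate limiting means in practice, here is a minimal token-bucket sketch. It is a generic illustration of the technique, not Kong's implementation or configuration, and the class name, capacity, and refill rate are assumptions chosen for the example.

```python
import time


class TokenBucket:
    """Allows short bursts up to `capacity`, then a steady `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float, now=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._now = now                 # injectable clock, useful for testing
        self.tokens = float(capacity)   # start with a full bucket
        self.updated_at = now()

    def allow(self) -> bool:
        # Refill according to the time elapsed since the last call,
        # never exceeding the bucket capacity.
        current = self._now()
        elapsed = current - self.updated_at
        self.updated_at = current
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)

        # Spend one token per request; reject when the bucket is empty.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A gateway would typically keep one such bucket per API key and reject a request (for example with HTTP 429) whenever `allow()` returns `False`.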

Taggun Application

The Taggun application is written in Node.js. The API follows the OpenAPI 2.0 Specification and exposes Swagger API documentation.

Docker Containers

The Taggun OCR web application runs as a Docker container. The workload is distributed across 10 or more container instances running on each virtual machine. The number of Taggun OCR web application instances can be scaled out further on demand.
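The text does not say which strategy is used to spread requests across the containers. As one simple possibility, a round-robin dispatcher could look like the sketch below; the function name and the container names are hypothetical.

```python
from itertools import cycle
from typing import Callable, Sequence


def make_dispatcher(instances: Sequence[str]) -> Callable[[], str]:
    """Return a function that hands out instances in round-robin order."""
    if not instances:
        raise ValueError("at least one instance is required")
    pool = cycle(instances)  # endlessly repeats the instance list in order
    return lambda: next(pool)
```

Each call to the returned function yields the next container in turn and wraps around after the last one, so adding more containers to the list spreads the same load more thinly.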


MongoDB

Taggun uses MongoDB, a NoSQL database, to store data for billing and training purposes. Operations on MongoDB are replicated across the other regions in real time. The application connects to MongoDB as a replica set.
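Connecting to MongoDB as a replica set usually means listing several members in the connection string and naming the set with the `replicaSet` option. The helper below is a small sketch that builds such a URI; the host names, database name, and replica set name are placeholders, not Taggun's actual values.

```python
from typing import Sequence
from urllib.parse import urlencode


def replica_set_uri(hosts: Sequence[str], database: str, replica_set: str) -> str:
    """Build a MongoDB connection URI that lists every replica set member."""
    if not hosts:
        raise ValueError("at least one host is required")
    query = urlencode({"replicaSet": replica_set})
    return f"mongodb://{','.join(hosts)}/{database}?{query}"


# Hypothetical members, one per region.
uri = replica_set_uri(
    ["db-a.example.internal:27017", "db-b.example.internal:27017"],
    database="taggun",
    replica_set="rs0",
)
```

Listing more than one member lets a driver discover the rest of the set and reconnect to a new primary if the current one becomes unavailable.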

S3 Storage

All files are stored in AWS S3 and encrypted with AES-256. By default, stored files are retained for 60 days to assist with training the machine learning algorithm and are then deleted. Optionally, Taggun can be configured with a longer or shorter image retention period based on the client’s requirements.
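The retention rule can be expressed as a simple age check. The sketch below assumes a per-client override of the 60-day default; the function and constant names are hypothetical, and the text does not say which mechanism Taggun actually uses (expiry of this kind is often delegated to an S3 lifecycle rule).

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_DAYS = 60  # default retention period stated in the text


def is_expired(
    uploaded_at: datetime,
    now: datetime,
    retention_days: int = DEFAULT_RETENTION_DAYS,
) -> bool:
    """Return True once a file has been stored for the full retention period."""
    return now - uploaded_at >= timedelta(days=retention_days)


# Example: a file uploaded on 1 January 2024 (UTC), checked 60 days later.
uploaded = datetime(2024, 1, 1, tzinfo=timezone.utc)
due_for_deletion = is_expired(uploaded, uploaded + timedelta(days=60))
```

Passing a different `retention_days` value models a client that has asked for a shorter or longer retention period.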

Google Vision

Google Vision API is used to perform simple image-to-text OCR scans.
