Cloud Based Targeted OCR Service

Founded in 1987, Blutex manages millions of records and provides office supplies throughout the UK. Initially supplying local businesses, Blutex now services national and multinational companies while still maintaining its relationships with local businesses.

A key part of their business is digital imaging and document scanning to store physical documents. This might include older documents that need to be digitised, or newer physical contracts that need to be converted into digital assets. Whether it’s the conversion of outmoded printed collateral or the creation of new material, their digital imaging solutions can scan, convert or store any documentation in a secure digital format that is accessible from anywhere in the world.

As part of this offering, Blutex wanted to provide a more automated, reliable and efficient OCR service that could extract key pieces of data, index the images and allow customers to search the data and view the documents on demand.

Blutex contacted Impact and together we came up with a commercial solution. The key features were:

  • A cloud based solution that can be accessed anywhere at any time, via a browser, without the need to install any software locally.
  • A secure SFTP location in which to bulk upload scanned documents.
  • A template facility so that different types of documents can be scanned and key data extracted, e.g. Invoice Date, Delivery Date, Account Reference.
  • A secure web interface so that Blutex and its customers can access the system.

How does it work?

The key element of the solution is a clever bit of software that we wrote, which runs in the cloud. Each type of document that is going to be uploaded (e.g. Delivery Note, Invoice, Medical Record, Payment Plan) needs a template so the system knows what data to extract and where it might be.

The template builder allows you to define where the data is and what format it is expected in, e.g. Date, Numbers, Alpha, Special characters. For number fields, you can also specify that an 'O' (oh) is automatically changed into a '0' (zero) and an 'I' (eye) into a '1' (one).
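
As a rough illustration only (this is not the production code), the number-field clean-up described above could look something like the sketch below. The function name and the mapping table are our own assumptions for the example.

```python
# Hypothetical sketch: tidy a value captured from a number field.
# Only the two substitutions mentioned in the text are applied.
DIGIT_LOOKALIKES = str.maketrans({"O": "0", "I": "1"})


def normalise_number_field(raw: str) -> str:
    """Trim whitespace, then swap letters commonly misread for digits."""
    return raw.strip().translate(DIGIT_LOOKALIKES)


print(normalise_number_field(" 1O45I "))  # 10451
```

The substitution should only be switched on for number fields, since in a text field an 'O' or an 'I' is usually correct.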

Another key feature is the ability to specify common interpretation errors. If the system always reads the word THREE as THR33, perhaps because of a poor font, poor scanning or poor document quality, you can configure a set of replacement rules so that once it has captured some text it converts THR33 to THREE and then runs any other rules. This can also be done after processing: if a lot of errors have been reported, the documents can be batched and reprocessed using the new rules.
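
A minimal sketch of how replacement rules of this kind might be applied is shown below. The rule table and the function name are hypothetical, and matching whole words only is an assumption made for the example.

```python
# Hypothetical rule set: each entry maps a known misreading to the intended text.
REPLACEMENT_RULES = {"THR33": "THREE"}


def apply_replacements(text: str, rules: dict[str, str]) -> str:
    """Replace any whole word that matches a known misreading."""
    words = text.split(" ")
    return " ".join(rules.get(word, word) for word in words)


print(apply_replacements("PACK OF THR33", REPLACEMENT_RULES))  # PACK OF THREE
```

Because the rules are plain data, a new rule can be added and the same function run again over a batch of previously captured text.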

Building a template for your document

 

Upload Your Files

Once a template has been defined, the software runs as a service, waiting for documents to appear in one of its many inboxes. Any FTP client or browser can be used to upload them, whether it is 1 file or 10,000.

 

Upload Files To The Inbox

As files are uploaded they are identified (e.g. the text at x,y says Invoice), and the system then targets the document, trying to find the requested data items.
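
To give a feel for the identification step, here is a simplified sketch. It assumes the OCR pass has already produced a list of words with pixel positions. The Word and Template classes, the identify function and the tolerance value are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: int
    y: int


@dataclass
class Template:
    name: str
    anchor_text: str  # text expected at the anchor position
    anchor_x: int
    anchor_y: int


def identify(words, templates, tolerance=5):
    """Return the first template whose anchor text sits at its expected position."""
    for template in templates:
        for word in words:
            if (word.text.lower() == template.anchor_text.lower()
                    and abs(word.x - template.anchor_x) <= tolerance
                    and abs(word.y - template.anchor_y) <= tolerance):
                return template
    return None  # unknown document type


templates = [
    Template("Delivery Note", "Delivery", 400, 50),
    Template("Invoice", "Invoice", 400, 50),
]
match = identify([Word("Invoice", 402, 51)], templates)
print(match.name)  # Invoice
```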

The system has a clever algorithm that can compensate for skewed or slipped scans: it searches around the expected area, looking for a suitable match.
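
The details of that algorithm are not covered here, but one simple way to search around an area is to take the nearest piece of text that matches the expected format. The sketch below does exactly that. It is a stand-in technique rather than the real implementation, and the function name, the (text, x, y) input tuples and the max_radius value are assumptions.

```python
import re
from math import hypot


def find_near(words, expected_x, expected_y, pattern, max_radius=40):
    """Return the matching text closest to the expected point, or None."""
    candidates = [
        (hypot(x - expected_x, y - expected_y), text)
        for text, x, y in words
        if re.fullmatch(pattern, text)
    ]
    in_range = [c for c in candidates if c[0] <= max_radius]
    return min(in_range)[1] if in_range else None


words = [("Invoice", 400, 50), ("12/03/2019", 312, 118), ("01/01/2020", 600, 500)]
date_pattern = r"\d{2}/\d{2}/\d{4}"

# The template expects a date at (300, 100); the scan has slipped slightly.
print(find_near(words, 300, 100, date_pattern))  # 12/03/2019
```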

As it locates each piece of data, it stores it in a database along with the relevant image location. The document is then moved from the inbox to the outbox. The service runs 24/7, continuing until all inboxes are clear.
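
The inbox-to-outbox flow can be pictured with the short sketch below, which makes a single pass over one inbox folder. The folder layout, the function name and the handle callback (standing in for the extract-and-store step) are assumptions for the example. A real service would repeat this pass continuously across all inboxes.

```python
import shutil
from pathlib import Path


def process_inbox(inbox: Path, outbox: Path, handle) -> int:
    """Process every file in the inbox once, then move it to the outbox."""
    outbox.mkdir(parents=True, exist_ok=True)
    processed = 0
    for path in sorted(inbox.iterdir()):
        if path.is_file():
            handle(path)  # hypothetical callback: extract data and store it
            shutil.move(str(path), str(outbox / path.name))
            processed += 1
    return processed
```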

Search and Browse Your Data

As soon as the document has been processed it becomes available in the cloud portal for customers to search, browse, view, download or print.

If the system could not extract some of the data, this is reflected in the process status. The file can then be reprocessed as it is, reprocessed after the template has been modified, or re-scanned and uploaded again. If the file name stays the same, the existing data is simply updated.
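
The "same file name, data updated" behaviour can be illustrated by keying each record on the file name, as in the sketch below. The table layout, the column names and the use of SQLite are assumptions made for the example and say nothing about the database actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "file_name TEXT PRIMARY KEY, invoice_date TEXT, status TEXT)"
)


def save_result(conn, file_name, invoice_date, status):
    """Insert a new record, or overwrite the existing one for the same file name."""
    conn.execute(
        "INSERT OR REPLACE INTO documents (file_name, invoice_date, status) "
        "VALUES (?, ?, ?)",
        (file_name, invoice_date, status),
    )


# First pass fails to find the date; the re-scanned upload replaces the record.
save_result(conn, "inv-001.tif", None, "incomplete")
save_result(conn, "inv-001.tif", "12/03/2019", "complete")
print(conn.execute("SELECT * FROM documents").fetchall())
# [('inv-001.tif', '12/03/2019', 'complete')]
```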

Since developing this service the system has processed millions of documents and has saved all stakeholders significant amounts of time.

The system is also capable of storing the documents in other cloud storage services, e.g. Amazon S3.

Interested?

If you are interested in learning more about this service, or you have a similar requirement, please get in touch. We love a chat!

 

Project info

  • Cloud Based Targeted OCR Service
  • Bespoke Software Development, Cloud Portal

My Silversands

https://www.mysilversands.com
Villa rental website with 128 exclusive properties overseas.

In 2009, I moved my business from a company in India, and engaged Impact Technology to maintain and develop my vacation rental website. Since then the website has undergone many large changes, from a complete redesign to simplifying booking functionality and enhancing usability features. There have also been many back-end changes, simplifying key administration tasks, including the creation of a variety of complex reporting and financial functions.
 
More recently we have been focused on social aspects and enhancing the delivery of guest services to our customers before arrival and during their stay. Impact have developed bespoke forums for us so that guests can ask questions of other guests prior to booking, and integrated live chat software to the website to support pre-sales. We are about to launch a bespoke private messaging system that will allow guests to communicate, both with each other and key members of staff, before, during and after their visit.

I have found Impact’s input into all of these changes invaluable. Andy will always find the most efficient and budget-conscious solution to a problem, and Jo has often been a sounding board for looking at things from the customer’s perspective. Thus far it’s been a mutually beneficial relationship and I am confident it will continue to be so.

Prem Chadeesingh, Managing Director