OCR on Android

What is OCR?

Optical character recognition (OCR) is the process of automatically identifying, in an image, characters or symbols that belong to a given alphabet. This post explains how to use OCR on Android.

Once the text in the image has been recognized, you can:

  • Save it to storage.
  • Process or edit it.
  • Translate it into another language.

The popularity of smartphones, combined with ever-better cameras, has led to wider use of this kind of recognition technique and to a new category of mobile apps built on it.

On device or in the cloud?

Before choosing an OCR library, you need to decide where recognition should run: on the smartphone or in the cloud.

Each approach has advantages and disadvantages, depending on the app's requirements.

If the app must, for example, recognize characters without an internet connection, the OCR engine has to run on the device itself. Running it locally also avoids uploading images to a server, which matters because the cameras on current devices produce large photos.

On the other hand, OCR libraries tend to take up a lot of space, and a separate data file must be downloaded for each language to be recognized, as explained below.

What libraries can be used?

Wikipedia has a comparison table of OCR libraries that lists supported platforms, the programming languages used to develop each one, and other relevant information.

Link: http://en.wikipedia.org/wiki/List_of_optical_character_recognition_software

In this post we use the Tesseract library, which stands out from the rest: it is open source, provides an SDK, was created by HP, and is currently developed by Google.

OCR on Android using Tesseract Library

Although Tesseract can run on a Linux server as a cloud service, in this post we integrate the Tesseract library into an Android app and run the OCR engine on the device itself.

The original Tesseract project for Android is called Tesseract Android Tools. It contains tools for compiling the Tesseract and Leptonica libraries for the Android platform, and a Java API for accessing these natively compiled libraries.

Link: https://github.com/rebbix/tesseract-android-tools/tree/master/tesseract-android-tools

For our example, we are going to use a fork of Tesseract Android Tools, which adds more functionality.

Link: https://github.com/rmtheis/tess-two

OCR Example on Android

We need a few simple steps to perform OCR on Android:

  1. Create a new Android Studio project.
  2. Add the Tesseract library to the project by adding the following lines to build.gradle:
    dependencies {
        implementation 'com.rmtheis:tess-two:6.0.0' // use 'compile' with Android Gradle plugin versions older than 3.0
    }
  3. Create a class called TessOCR with the following code:
    import android.content.Context;
    import android.graphics.Bitmap;
    import com.googlecode.tesseract.android.TessBaseAPI;

    public class TessOCR {
        private final TessBaseAPI mTess;

        public TessOCR(Context context, String language) {
            mTess = new TessBaseAPI();
            // Parent directory of the "tessdata" folder that holds the
            // <language>.traineddata files.
            String datapath = context.getFilesDir() + "/tesseract/";
            mTess.init(datapath, language);
        }

        // Runs recognition on the bitmap and returns the recognized text.
        public String getOCRResult(Bitmap bitmap) {
            mTess.setImage(bitmap);
            return mTess.getUTF8Text();
        }

        // Releases the native resources held by the OCR engine.
        public void onDestroy() {
            if (mTess != null) {
                mTess.end();
            }
        }
    }
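    • The constructor takes a context (for example, the MainActivity context) and the language to recognize, which is used to initialize the OCR engine. Language codes are three-letter codes based on ISO 639-2 and must match the name of the language's data file. Examples: spa (Spanish), eng (English), chi_sim (Simplified Chinese).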
    • Note:
      • Each language requires a data file that must be downloaded and saved to device storage. In our case it is stored under the app data directory in /tesseract/tessdata/, because Tesseract looks for a tessdata folder inside the data path.
      • The data files for each language are available at the following link: https://github.com/tesseract-ocr/tessdata.
    • The getOCRResult method returns the text recognized in the image passed as an argument.
  4. Import the TessOCR class created in the previous step into MainActivity and create a new recognition instance with the following line:
    mTessOCR = new TessOCR(this, language);
  5. Add to MainActivity the method that performs character recognition:
    private void doOCR(final Bitmap bitmap) {
        // ocrView is the Activity that hosts the OCR screen.
        if (mProgressDialog == null) {
            mProgressDialog = ProgressDialog.show(ocrView, "Processing",
                    "Doing OCR...", true);
        } else {
            mProgressDialog.show();
        }
        // Recognition is slow, so run it off the UI thread.
        new Thread(new Runnable() {
            @Override
            public void run() {
                final String srcText = mTessOCR.getOCRResult(bitmap);
                ocrView.runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        if (srcText != null && !srcText.equals("")) {
                            // srcText contains the recognized text
                        }
                        // Releases the engine; create a new TessOCR
                        // instance before running recognition again.
                        mTessOCR.onDestroy();
                        mProgressDialog.dismiss();
                    }
                });
            }
        }).start();
    }
    • This code first shows a progress dialog that indicates recognition is in progress. It then starts a new thread that performs recognition by calling the getOCRResult method. When recognition finishes, the dialog is dismissed; if everything worked, srcText holds the recognized text.
  6. Call the method from step 5 to start recognition:
    doOCR(bitmap);
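
Step 3 assumes the language data file is already on the device. Since tess-two looks for it in a folder named tessdata under the data path passed to init, it can help to check that layout before creating TessOCR. The following is a minimal sketch in plain Java, so it can be tried outside Android; the class and method names (TessDataCheck, trainedDataFile, missingLanguages) are hypothetical helpers and not part of tess-two.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of tess-two.
public class TessDataCheck {

    // Returns the file expected for one language,
    // i.e. <dataPath>/tessdata/<language>.traineddata.
    public static File trainedDataFile(File dataPath, String language) {
        return new File(new File(dataPath, "tessdata"), language + ".traineddata");
    }

    // Accepts a single code ("spa") or several joined with '+' ("eng+spa")
    // and returns the codes whose data files are not on disk.
    public static List<String> missingLanguages(File dataPath, String languages) {
        List<String> missing = new ArrayList<>();
        for (String code : languages.split("\\+")) {
            if (!trainedDataFile(dataPath, code).isFile()) {
                missing.add(code);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the data path is given as the first argument;
        // on Android it would be context.getFilesDir() + "/tesseract/".
        File dataPath = new File(args.length > 0 ? args[0] : "tesseract");
        System.out.println("Missing language files: "
                + missingLanguages(dataPath, "eng+spa"));
    }
}
```

In the app, a check like missingLanguages(new File(context.getFilesDir(), "tesseract"), language) could run before new TessOCR(this, language), so that a missing file is reported to the user instead of surfacing as an initialization error.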

Considerations

  • Recognition quality varies with lighting conditions, camera resolution, text font, text size, and other factors.
  • For the best possible quality, center the text in the image and make sure the image is properly focused.

Preview using OCR in a translator app

The following video shows part of the app I’m developing for my degree final project (TFG), where I use the OCR techniques described.

 


6 thoughts on “OCR on Android”

    • Hi Paul, thank you for your comment.

      In the project this code comes from, ocrView was an activity: I used an MVP (Model-View-Presenter) pattern, and the code in steps 4 and 5 belongs to a presenter, not to MainActivity. I need to update the post to avoid confusion.

      To answer your question, you have two options:

      1. Use the code from steps 4 and 5 in a presenter and declare ocrView as shown below:


      private final OCRActivity ocrView;

      public OCRViewPresenter(OCRActivity view) {
          ocrView = view;
      }

      2. Implement OCR in the activity itself, using this instead of ocrView in the first occurrence and YourActivityName.this in the second one (YourActivityName.this.runOnUiThread).
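
For option 1, a presenter along these lines could be sketched as follows. This is only an illustration under stated assumptions, written in plain Java so it can run outside Android: OcrView is a hypothetical interface standing in for OCRActivity, the recognizer function stands in for TessOCR.getOCRResult, and an ExecutorService replaces the raw Thread from step 5.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

public class OcrPresenterSketch {

    // Hypothetical view contract; on Android the Activity would implement it.
    public interface OcrView {
        void showProgress();
        void hideProgress();
        void showText(String text);
        void runOnUiThread(Runnable action); // Activity already has this method
    }

    // I is the image type (Bitmap on Android).
    public static class OcrPresenter<I> {
        private final OcrView view;
        private final Function<I, String> recognizer; // stand-in for TessOCR
        private final ExecutorService worker = Executors.newSingleThreadExecutor();

        public OcrPresenter(OcrView view, Function<I, String> recognizer) {
            this.view = view;
            this.recognizer = recognizer;
        }

        public void doOCR(final I image) {
            view.showProgress();
            // Recognition runs on the worker thread, not on the caller's thread.
            worker.execute(() -> {
                final String text = recognizer.apply(image);
                // Results go back to the view through runOnUiThread.
                view.runOnUiThread(() -> {
                    if (text != null && !text.isEmpty()) {
                        view.showText(text);
                    }
                    view.hideProgress();
                });
            });
        }

        // Lets queued work finish and stops the worker thread.
        public void shutdown() {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        OcrView console = new OcrView() {
            public void showProgress() { System.out.println("Doing OCR..."); }
            public void hideProgress() { System.out.println("Done."); }
            public void showText(String text) { System.out.println("Text: " + text); }
            public void runOnUiThread(Runnable action) { action.run(); }
        };
        // The "image" is a String here and the recognizer simply echoes it.
        OcrPresenter<String> presenter = new OcrPresenter<>(console, image -> image);
        presenter.doOCR("sample text");
        presenter.shutdown();
    }
}
```

Because the presenter depends only on the interface, the same code works whether the view is an Activity or a test double.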

    • Hi manish, thank you for your comment.

      OCR results are likely to be inaccurate if:

      – Lighting conditions are poor.

      – Camera resolution is low.

      – The text is not large enough.

      So try to get the highest possible image quality, center the text in the image, and focus the image properly. If you have any questions after that, please let me know.
