How-to: Google Docs OCR

Update May 1, 2015: (a9t9) launched its very own free and open-source Online OCR service - try it out and let us know how it compares.

How to convert an image or (an image inside a) PDF into text using Google’s free OCR converter. The confusing part is: there is no button, no checkbox and no mention of OCR in the Google Docs of 2015. So if you cannot find it, it is not you being stupid…

Google OCR works indirectly by automatically scanning and converting images to text, but only if Google Docs “thinks” there is text in an image. In other words: if there is no text that Google can recognize, nothing happens. But now, step by step:

OCR with Google Docs/Google Drive

If you are not there already, leave Google Docs and go to Google Drive

Google Drive opens in a new tab, silently!

Upload the image to Google Drive

Select 'Upload files' and upload your image or PDF.
File size limitations: The maximum size for images (.jpg, .gif, .png) and PDF files (.pdf) is 2 MB.
For PDF files, Google looks only at the first 10 pages when searching for text to extract.
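
Since Google fails silently, it can help to check a file against these limits before uploading. The following is a minimal sketch, not an official tool: the function name ocr_upload_problems is made up for illustration, and reading "2 MB" as 2 * 1024 * 1024 bytes is an assumption.

```python
import os

# Limits as quoted in this article (2015). Interpreting "2 MB" as
# 2 * 1024 * 1024 bytes is an assumption.
MAX_BYTES = 2 * 1024 * 1024
OCR_EXTENSIONS = {".jpg", ".gif", ".png", ".pdf"}


def ocr_upload_problems(path):
    """Return a list of reasons why Google might skip OCR for this file."""
    problems = []
    # Compare the extension case-insensitively (.PNG counts as .png).
    ext = os.path.splitext(path)[1].lower()
    if ext not in OCR_EXTENSIONS:
        problems.append("unsupported file type: " + (ext or "(none)"))
    size = os.path.getsize(path)
    if size > MAX_BYTES:
        problems.append("file is %d bytes, limit is %d" % (size, MAX_BYTES))
    return problems
```

An empty list means the file passes both checks. It does not guarantee that OCR will actually kick in, and it cannot tell how many pages a PDF has.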

Start the OCR conversion: Right-click on the file and select “Open with Google Docs”


Done: Now your image is inside a Google Doc and the extracted text is below the image

As with many things, once you know how, it’s easy. The OCR’ed text is below the embedded image.
If there is no text, then Google could not extract anything and fails silently.
Example Google Doc with the conversion result: https://docs.google.com/document/d/1AUPOWk9laXMLD0G-WT7DtFQUAOANCBRjF29-zmAN71o/edit?usp=sharing

Automatic OCR on uploads: In GDrive (not Google Docs!) click on the cog at top right, then select “Settings”. In the popup that opens you can change the upload settings. For an automatic conversion to the Google Docs format (and the automatic document OCR that comes with it) check the box next to “Convert Uploads” as shown in the screenshot:

Automatic OCR processing: In GDrive click on the cog at top right, then check the Convert uploads box.

Of course, what automatic OCR does not solve is my #1 complaint with Google OCR: One never knows if and when the OCR process kicks in, and if it does not, why not.

Google OCR API

Many have asked me, but unlike Baidu, Google offers no dedicated OCR API. If you insist on using Google for OCR, you can only use the Google Drive REST API to upload/insert documents to Google Docs. Basically you use the API to replicate the manual process described above. The API takes parameters such as:

  • ocr - Whether to attempt OCR on .jpg, .png, .gif, or .pdf uploads. (Default: false)
  • ocrLanguage - If ocr is true, hints at the language to use. Valid values are BCP 47 codes.
  • Details here: Google Drive REST API
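
To make the two parameters concrete, here is a minimal sketch that only builds the request URL and sends nothing. The endpoint and the uploadType parameter are assumptions based on the Drive REST API v2 upload interface, and the helper name build_ocr_upload_url is made up for illustration. Check the API documentation before relying on any of it.

```python
from urllib.parse import urlencode

# Assumed endpoint: the Drive REST API v2 upload URL for inserting files.
UPLOAD_URL = "https://www.googleapis.com/upload/drive/v2/files"


def build_ocr_upload_url(language=None):
    """Build an upload URL that asks Drive to attempt OCR on the file."""
    params = {
        "uploadType": "media",  # assumption: simple upload, file bytes as body
        "ocr": "true",          # the default is false, so it must be set
    }
    if language:
        # BCP 47 language code used as a hint, e.g. "en" or "de"
        params["ocrLanguage"] = language
    return UPLOAD_URL + "?" + urlencode(params)
```

A real upload would POST the file bytes to this URL together with an OAuth 2.0 access token. Both steps are left out here.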

Some more OCR tips for better results:

1. Examples of files suitable for OCR:

  • Image or PDF files obtained using flatbed scanners
  • Photos taken with digital cameras or mobile phones
  • Screenshots (e.g. from YouTube videos)

2. For best results, the image or PDF files need to meet certain requirements:

Resolution: High-resolution files work best. Google recommends that each line of text in the document be at least 10 pixels high.
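
For scanned paper, the 10 pixel rule translates into a minimum scan resolution. The sketch below is back-of-the-envelope arithmetic only. It assumes the line height roughly equals the font size in points (1 point = 1/72 inch), and both function names are made up for illustration.

```python
POINTS_PER_INCH = 72     # 1 typographic point = 1/72 inch
MIN_LINE_HEIGHT_PX = 10  # the recommendation quoted above


def line_height_px(font_size_pt, scan_dpi):
    """Approximate height in pixels of one text line after scanning."""
    return font_size_pt * scan_dpi / POINTS_PER_INCH


def min_scan_dpi(font_size_pt):
    """Lowest scan resolution at which a line reaches 10 pixels."""
    return MIN_LINE_HEIGHT_PX * POINTS_PER_INCH / font_size_pt
```

Printed glyphs are smaller than the nominal font size, so treat the result as a rough floor and scan at a higher resolution where the file size limit allows.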

Orientation: Only documents with horizontal left-to-right text are recognized. If you’ve scanned or captured a document in a different orientation, you can use a program like “Windows Photo Viewer” (part of Windows!) to rotate the image before uploading it to Google Drive.

File size limitations: The maximum size for images (.jpg, .gif, .png) and PDF files (.pdf) is 2 MB. For PDF files, Google looks only at the first 10 pages when searching for text to extract.

3. Further reading: