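As an example project, something I’m working on required me to read the text out of an image and then highlight keywords in it. I broke this down into three steps: upload the image to Google Drive, run OCR on it, and search the recognized text for keywords.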
I’ll start by giving you access to the finished project code. To run it, you will have to make your own copy of the project with “File–>Make A Copy”. Then follow the steps on this webpage word-for-word to activate the Drive API in the developers console and you’ll be fine. Sometimes the developers console site doesn’t want to load; just keep trying until it works. When you want to run the code, click “Publish–>Deploy as web app”. The first time, you’ll have to set who is allowed to access the app. If you let them execute the app as your account, they will also have access to your Google Drive, so be careful. Once you’ve set this, click “Deploy” and you’ll get the link to the web app. The first time you run it, you will have to give it access rights to your Google Drive.
The Guts and Explanation:
Luckily, while searching I stumbled across the blog of Amit Agarwal, which is full of great example code. For step one, the simplest solution I could find was his code example that uses a custom HTML form to upload an image to your Google Drive account. This might seem like overkill. Why didn’t I just use the fileID of something already in my Google Drive? Well, in this particular application, the user might be uploading multiple files from a camera on a mobile device. The form is a simple web-based uploader that gives me complete access to the guts of what’s going on, which makes it easy to get the fileIDs of the different files.
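To make the uploader step a bit more concrete, here is a minimal sketch of what the server side of such a form can look like in Google Apps Script. It is not Amit’s code or the exact project code. The folder name `OCR Uploads`, the function name `saveUpload`, and the form field name `myFile` are all assumptions made up for illustration.

```javascript
// Hypothetical folder name; the real project may use a different one.
var UPLOAD_FOLDER = 'OCR Uploads';

// Hypothetical server-side handler for the upload form (Google Apps Script).
// `form.myFile` assumes the file input in the HTML form is named "myFile".
function saveUpload(form) {
  // Reuse the subfolder if it already exists, otherwise create it.
  var folders = DriveApp.getFoldersByName(UPLOAD_FOLDER);
  var folder = folders.hasNext() ? folders.next() : DriveApp.createFolder(UPLOAD_FOLDER);

  // A file input submitted through the form arrives as a blob.
  var file = folder.createFile(form.myFile);

  // Hand both IDs back so later steps know which file to OCR
  // and which folder to put the result in.
  return { folderId: folder.getId(), fileId: file.getId() };
}
```

From the form’s page, a function like this would be called through `google.script.run`, with the form element passed as its only argument.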
For the second step, the OCR, I had tried multiple times, with multiple methods, to get a native Android OCR library to work. Every tutorial I found was highly dependent on the version of the Android SDK or Eclipse I had, or I would just hit a dead end when I tried to compile. Plus, those would only work on Android, not on every device. I really wanted a web app. That’s actually how I stumbled upon Google Apps Script (GAS) again. I had played with it a bit in the past, but this time, after finding some great example OCR code on Amit’s website, I found GAS much more accessible. I added this code to the HTML form example from Step 1 above and tweaked it a bit.
In the original OCR code snippet, the script reads an image, then creates a Google Doc containing the image followed by the text it recognized. This file is saved into the root folder of Google Drive. Since the uploader code (from Step 1) saved its picture into a subfolder, I made the OCR code save into that same folder. The way to do this is to add a “parents” ID tag to the properties of the OCR file. Since I already had the folderID from the uploader code, the change itself was easy to make; working out that this was the change to make was not. As I said, I couldn’t find a lot of info on the GAS language, and this was one of the things that took me a while to find. By looking at how Google’s other services save a file into a subfolder, I was able to do the same thing in GAS. You can see the results below.
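Here is a rough sketch of the “parents” trick, assuming the OCR runs through the Advanced Drive Service with version 2 of the Drive API (where the parent folder is given as a list of objects with an `id`). The function names `buildOcrResource` and `ocrIntoFolder` are hypothetical, and this is an illustration rather than the exact project code.

```javascript
// Hypothetical helper: builds the metadata ("resource") for the OCR'd Doc.
// In Drive API v2, `parents` is a list of objects that each carry a folder id.
function buildOcrResource(title, folderId) {
  return {
    title: title,
    parents: [{ id: folderId }]
  };
}

// Hypothetical wrapper around the Advanced Drive Service (Drive API v2).
// This part only works inside Apps Script with the Drive service enabled.
function ocrIntoFolder(imageBlob, folderId) {
  var resource = buildOcrResource(imageBlob.getName(), folderId);
  // `ocr: true` asks Drive to convert the image into a Doc with recognized text.
  var file = Drive.Files.insert(resource, imageBlob, { ocr: true });
  return { id: file.id, url: file.alternateLink };
}
```

Leaving `parents` out of the resource is what sends the new Doc to the root folder.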
I tweaked the form code to print out the URL of the newly created Doc file with the searchable text. I couldn’t figure out how to get logging to work in GAS at all. Every time I tried to run something that should print to the console, a console would briefly appear, but disappear before I could read anything. So I just stuck with having the form print the URLs. Then I could copy and paste them into the address bar to visit the document.
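As an illustration of the “print the URL in the form” approach, here is a small sketch. It assumes the OCR Doc was created through the Advanced Drive Service (Drive API v2), whose file resource carries `title` and `alternateLink` fields. The function name `describeOcrDoc` is made up for this example. `Logger.log` is the standard Apps Script logging call; the guard around it only exists so the helper can also run outside Apps Script.

```javascript
// Hypothetical helper: turns a Drive API v2 file resource into the line
// that the form displays after an upload.
function describeOcrDoc(file) {
  var message = 'OCR Doc "' + file.title + '": ' + file.alternateLink;
  // Logger.log writes to the Apps Script log; skipped when Logger is not defined.
  if (typeof Logger !== 'undefined') {
    Logger.log(message);
  }
  return message;
}
```

Returning the message from the server function lets the form show it directly, so nothing depends on the log being readable.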
The third step was searching for keywords within the text. Again, someone else had done the hard work for me. I simply tweaked the code and pasted it as a function into my script.
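The keyword step can be sketched roughly like this. It is not the code I borrowed, just a minimal illustration of the idea: find where each keyword occurs in a paragraph’s text, then set a background color on those character ranges. The function names and the yellow color are assumptions, and the matching is a plain case-insensitive substring search that assumes ordinary English text.

```javascript
// Returns [start, endInclusive] offsets for every occurrence of any keyword.
// Matching is case-insensitive and purely substring-based.
function findKeywordRanges(text, keywords) {
  var ranges = [];
  var lower = text.toLowerCase();
  keywords.forEach(function (keyword) {
    var needle = keyword.toLowerCase();
    if (!needle) return; // skip empty keywords
    var from = 0;
    var at;
    while ((at = lower.indexOf(needle, from)) !== -1) {
      ranges.push([at, at + needle.length - 1]);
      from = at + needle.length;
    }
  });
  // Sort by start offset so the ranges read in document order.
  ranges.sort(function (a, b) { return a[0] - b[0]; });
  return ranges;
}

// Hypothetical Apps Script part: highlights the keywords in a Google Doc.
// Only works inside Apps Script, where DocumentApp is available.
function highlightKeywords(docId, keywords) {
  var body = DocumentApp.openById(docId).getBody();
  body.getParagraphs().forEach(function (paragraph) {
    var textElement = paragraph.editAsText();
    findKeywordRanges(textElement.getText(), keywords).forEach(function (range) {
      textElement.setBackgroundColor(range[0], range[1], '#ffff00');
    });
  });
}
```

Keeping the search in its own plain function means it can be tried out on any string before pointing it at a real Doc.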
To test the code, I printed a page of public domain POE-etry and took a picture of the page with my cellphone to simulate what a user might do. To make sure that I’d get a lot of hits on my text search, I hardcoded the keywords for the search to be the rhyming sounds in the poem, in this case –oor and –ore. Then I uploaded the image to my Google Drive using the HTML form. After a couple of seconds (the upload takes a little while), all I needed to do was open the newly created OCR Doc file and see how well the OCR worked.
As I said before, Google Drive’s OCR process pastes the image into the file, then converts the image to text. Even with a photo taken in normal room lighting at night, it was really good at recognizing the text, and the script easily found and highlighted the keywords I hardcoded. Here’s the resulting file.