Image Recognition: Label Detection using Google Cloud Vision API

The Google Cloud Vision API supports label detection, face detection, logo detection, landmark detection and text detection. In this article, we will see how to use the Google Cloud Vision API to identify labels in an image. This is a step-by-step guide to label detection. Let's get started.

Step 1: Set up a Google Cloud Account

A) Go to: https://console.cloud.google.com/
B) Log in with your Google credentials
C) You will see a dashboard. Create a project if you have not already created one.


Step 2: Enable Cloud Vision API

A) Go to the console
B) Click on the Navigation Menu
C) Click on APIs & Services >> Library
D) Search for "cloud vision" and you will find the "Cloud Vision API". Enable this API if it is not already enabled.


Step 3: Download credentials file

A) Go to the console
B) Click on the Navigation Menu
C) Click on APIs & Services >> Credentials
D) Click on the Create Credentials dropdown >> Service account key >> New service account
E) Enter a service account name
F) Select any role. I selected Project >> Viewer
G) Save the key file as JSON on your hard drive and rename it to 'credentials.json'.

Step 4: Add billing information

A) Go to the console
B) Click on the Navigation Menu
C) Click on Billing

Now open a Jupyter notebook and try out the API. You can download my Jupyter notebook containing the code below from here.

 
Step 5: Import required libraries

from googleapiclient.discovery import build
from oauth2client.client import GoogleCredentials
from base64 import b64encode


You may get the import error "no module name..." if you have not already installed the Google API Python client. Use the following command to install it:

pip install --upgrade google-api-python-client

If you also get an import error for oauth2client, install it using the following command:

pip install --upgrade oauth2client
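
Before moving on, it can help to confirm that both packages are visible from the notebook's Python environment. The following is a minimal sketch that uses only the standard library; the helper name `missing_modules` is hypothetical and not part of either package.

```python
from importlib.util import find_spec

# Top-level package names behind the two third-party imports in Step 5.
REQUIRED = ['googleapiclient', 'oauth2client']

def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]

missing = missing_modules(REQUIRED)
if missing:
    print('Not installed:', ', '.join(missing))
else:
    print('All required packages are installed.')
```

If a package is reported as not installed even after running pip, the notebook kernel is probably using a different Python environment from the one pip installed into.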

Step 6: Load credentials file

Load the credentials file (which we created in Step 3) and use it to create a service object for version 'v1' of the Vision API.

CREDENTIAL_FILE = 'credentials.json'
credentials = GoogleCredentials.from_stream(CREDENTIAL_FILE)
service = build('vision', 'v1', credentials=credentials)
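
If loading the credentials fails, a quick look inside the key file can narrow down the cause. The sketch below is an optional sanity check, not part of the original walkthrough. It assumes the file is a service account key, which normally contains the fields `type`, `client_email` and `private_key`; the helper names `load_key` and `find_key_problems` are hypothetical.

```python
import json

# A subset of the fields a service account key file normally contains.
EXPECTED_FIELDS = ('type', 'client_email', 'private_key')

def load_key(path):
    """Read the JSON key file and return its contents as a dict."""
    with open(path, 'r', encoding='utf-8') as f:
        return json.load(f)

def find_key_problems(key):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [f'missing field: {name}'
                for name in EXPECTED_FIELDS if name not in key]
    # A key downloaded for a service account has type 'service_account'.
    if 'type' in key and key['type'] != 'service_account':
        problems.append(f"unexpected type: {key['type']!r}")
    return problems
```

Calling `find_key_problems(load_key('credentials.json'))` should return an empty list for a well-formed key file. The check only looks at the file's structure; it does not verify that the key is still valid.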

Step 7: Load the image file to be analyzed

We will load an image of a cat and base64-encode it, because the Cloud Vision API expects the image content as base64 text inside the JSON request.

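IMAGE_FILE = 'cat.jpg'
# Read the raw bytes of the image in binary mode.
with open(IMAGE_FILE, 'rb') as file:
    image_data = file.read()
# Base64-encode the bytes and convert the result to a str for the JSON body.
encoded_image_data = b64encode(image_data).decode('UTF-8')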
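
The same encoding step can be wrapped in small reusable functions, which is handy when testing several images. This is a minimal sketch; the names `encode_image_bytes` and `encode_image_file` are hypothetical helpers, not part of any Google library.

```python
from base64 import b64encode, b64decode

def encode_image_bytes(image_data):
    """Base64-encode raw image bytes and return the result as a str."""
    return b64encode(image_data).decode('UTF-8')

def encode_image_file(path):
    """Read an image file in binary mode and return its base64 text."""
    with open(path, 'rb') as file:
        return encode_image_bytes(file.read())

# Round-trip check on four sample bytes (the first bytes of a typical JPEG file).
sample = b'\xff\xd8\xff\xe0'
encoded = encode_image_bytes(sample)
assert b64decode(encoded) == sample
print(encoded)
```

With these helpers, `encoded_image_data = encode_image_file('cat.jpg')` produces the same value as the code above.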

Step 8: Create a batch request

We will create a batch request to send to the Cloud Vision API. Each entry in the batch contains an image (here, the encoded image from the previous step) and the list of features to detect. For label detection, the feature type is LABEL_DETECTION.

batch_request = [{
    'image':{'content':encoded_image_data},
    'features':[{'type':'LABEL_DETECTION'}],
}]
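
Since a batch request is just a list of dictionaries, it is easy to generate one for several images or several feature types. The following sketch shows one way to do it. The function names `make_features` and `build_batch_request` are hypothetical; the `maxResults` field and the feature type names come from the Vision API's request format.

```python
def make_features(feature_types, max_results=None):
    """Build the 'features' list for one image."""
    features = []
    for feature_type in feature_types:
        feature = {'type': feature_type}
        if max_results is not None:
            # Optional cap on the number of results returned for this feature.
            feature['maxResults'] = max_results
        features.append(feature)
    return features

def build_batch_request(encoded_images, feature_types=('LABEL_DETECTION',),
                        max_results=None):
    """Build one request entry per base64-encoded image."""
    return [{'image': {'content': content},
             'features': make_features(feature_types, max_results)}
            for content in encoded_images]
```

For example, `build_batch_request([encoded_image_data], ['LABEL_DETECTION', 'LOGO_DETECTION'], max_results=5)` asks for at most five labels and five logos for the cat image. With the default arguments, the result has the same structure as the batch request above.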

Step 9: Create a request

request = service.images().annotate(body={'requests':batch_request})

Step 10: Execute the request

response = request.execute()

This step will raise an error if you have not enabled billing (as mentioned in Step 4), so you must enable billing in order to use the Google Cloud Vision API. The charges are very reasonable, so go ahead and provide your credit card details. In my case, Google charged INR 1 and then refunded it.

Step 11: Process the response

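Request-level failures are raised as an exception by execute(). Errors for an individual image are reported inside that image's entry in the response, so check for them there:

result = response['responses'][0]
if 'error' in result:
    raise RuntimeError(result['error'])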


We are interested in the label annotations here, so fetch them from the response and display each label with its confidence score.

labels = response['responses'][0]['labelAnnotations']

for label in labels:
    print(label['description'], label['score'])

Output:

Cat 0.99598557
Mammal 0.9890478
Vertebrate 0.9851104
Small to medium-sized cats 0.978553
Felidae 0.96784574
European shorthair 0.960582
Tabby cat 0.9573447
Whiskers 0.9441685
Dragon li 0.93990624
Carnivore 0.9342105
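
The scores make it easy to keep only the labels the API is most confident about. Below is a minimal sketch of such a filter. The function name `top_labels` and the 0.95 threshold are arbitrary choices for illustration, and `sample_response` is a hand-built stand-in that reuses two values from the output above, so the snippet runs without calling the API.

```python
def top_labels(response, min_score=0.95, index=0):
    """Return (description, score) pairs at or above min_score for one image."""
    result = response['responses'][index]
    if 'error' in result:
        raise RuntimeError(result['error'])
    # 'labelAnnotations' may be absent when no labels were detected.
    labels = result.get('labelAnnotations', [])
    return [(label['description'], label['score'])
            for label in labels if label['score'] >= min_score]

# Hand-built stand-in for an API response (not real API output).
sample_response = {'responses': [{'labelAnnotations': [
    {'description': 'Cat', 'score': 0.99598557},
    {'description': 'Whiskers', 'score': 0.9441685},
]}]}

print(top_labels(sample_response))
```

In the notebook, `top_labels(response)` can be called on the real response from Step 10 in the same way.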

You can test the above code using different images and check the accuracy of the API.