Hussain Arif is a CS student in Pakistan whose biggest interest is learning and teaching programming to make the world a better place.

Build a text detector in React Native


Say you’re working at a bank where you need to enter data from customers’ paper forms into a computer. To do this, you would have to:

  • Enter each customer’s name and personal details
  • Double-check that you have entered the data correctly, to prevent human error
  • Verify that every value ended up in the right field

This workflow might be manageable for a handful of forms, but what if your bank receives hundreds every day? The job quickly becomes tedious and stressful. So how do we solve this problem?

This is where OCR (optical character recognition, or text detection) comes in. OCR technology uses algorithms to extract text from images with high accuracy. With text recognition, you can simply take a picture of the customer’s form and let the computer fill in the data for you, making your work easier and less repetitive. In this article, we will build a text detector in React Native using Google’s Cloud Vision API.

This will be the outcome of this article:

RN Text Detector Demo

Let’s get started!

Using Google Cloud Vision

In this section, you will learn how to activate Google’s text detection API for your project.

As a first step, navigate to Google Cloud Console and click on New Project:
Google Cloud Console New Project

Next, we need to tell Google that we want to use the Cloud Vision API. To do so, click on Marketplace:


Google Cloud Console Marketplace

When that’s done, search for Cloud Vision API and enable it like so:

Enable Google Cloud Vision API

Great! We have now enabled this API. However, for authorization purposes, Google requires us to create an API key. To make this possible, click on Create Credentials:

Create Cloud Vision API Credentials

In the Credentials menu, make sure the following options are checked:

Credentials Menu Configuration Google Cloud Vision

Next, click on Done. This will bring you to the dashboard page. Here, click on Create credentials, and then API key.

Google Cloud Vision API Key

As a result, the program will now give you an API key. Copy this key and store it somewhere safe.

Copy API Key Google Cloud Vision

Congratulations! We’re now done with the first step. Let’s now write some code!

Building our project

Project creation

To initialize the repository using Expo CLI, run the following terminal command:

expo init text-detector-tutorial

Expo will now prompt you to choose a template. Here, select the option that says minimal:

Expo CLI Minimal Option

Installation of modules

For this application, we will let the user pick photos from their camera roll. To make this possible, we will use the expo-image-picker module:

npm i expo-image-picker

Coding our utility functions

Create a file called helperFunctions.js. As the name suggests, this file will contain our utility functions that we will use throughout our project.

In helperFunctions.js, start by writing the following code:

//file name: helperFunctions.js
const API_KEY = 'API_KEY_HERE'; //put your key here.
//this endpoint will tell Google to use the Vision API. We are passing in our key as well.
const API_URL = `https://vision.googleapis.com/v1/images:annotate?key=${API_KEY}`;
function generateBody(image) {
  const body = {
    requests: [
      {
        image: {
          content: image,
        },
        features: [
          {
            type: 'TEXT_DETECTION', //we will use this API for text detection purposes.
            maxResults: 1,
          },
        ],
      },
    ],
  };
  return body;
}

A few key concepts from this snippet:

  • The API_KEY constant holds your API key, which is required to use Google’s Cloud Vision API in our app
  • The API_URL endpoint tells Google to run the Vision API, with our key passed as a query parameter
  • The generateBody function builds the JSON payload that React Native will send to Google for OCR
  • Its image parameter holds the base64-encoded data of the selected image
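To make the payload concrete, here is a small standalone sketch that calls generateBody with a made-up base64 string (the real app passes actual image data):

```javascript
// Standalone sketch of the payload generateBody produces.
// 'aGVsbG8=' is an invented base64 string standing in for real image data.
function generateBody(image) {
  return {
    requests: [
      {
        image: { content: image },
        features: [{ type: 'TEXT_DETECTION', maxResults: 1 }],
      },
    ],
  };
}

const payload = generateBody('aGVsbG8=');
console.log(JSON.stringify(payload, null, 2));
```

Each entry in the requests array pairs one image with the features Google should run on it, which is why the body nests the image content under requests[0].image.content.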

After this step, append the following code to helperFunctions.js:

//file: helperFunctions.js
async function callGoogleVisionAsync(image) {
  const body = generateBody(image); //pass in our image for the payload
  const response = await fetch(API_URL, {
    method: 'POST',
    headers: {
      Accept: 'application/json',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  });
  const result = await response.json();
  console.log(result);
}
export default callGoogleVisionAsync;

Let’s break down this code piece by piece:

  • First, we generate a payload using the generateBody function
  • Next, the program submits a POST request to Google’s API, sending the payload as the request body
  • Finally, we output the server’s response to the console
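At this point it helps to know roughly what comes back. The sketch below mocks an abridged successful response and a top-level error response (which you will see if, for example, billing is not enabled on your project); the field names follow the Vision API’s documented format, but the sample values are invented:

```javascript
// Abridged mock of a successful Vision response and a failed one.
// Sample values are invented for illustration.
const successResponse = {
  responses: [
    {
      fullTextAnnotation: { text: 'Hello world' },
    },
  ],
};

const errorResponse = {
  error: { code: 403, message: 'This API method requires billing to be enabled.' },
};

// A small guard you could run before reading `responses`:
function hasVisionError(result) {
  return Boolean(result.error);
}

console.log(hasVisionError(successResponse)); // false
console.log(hasVisionError(errorResponse)); // true
```

Checking for a top-level error field before touching responses makes failures easier to diagnose than an opaque "cannot read property" crash.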

Building our image picker component

Create a new file called ImagePickerComponent.js. This file will be responsible for letting the user choose a photo from their gallery.

In ImagePickerComponent.js, write the following code:

import * as ImagePicker from 'expo-image-picker';
import React, { useState, useEffect } from 'react';
import { Button, Image, View, Text } from 'react-native';

function ImagePickerComponent({ onSubmit }) {
  const [image, setImage] = useState(null);
  const [text, setText] = useState('Please add an image');

  const pickImage = async () => {
    let result = await ImagePicker.launchImageLibraryAsync({
      mediaTypes: ImagePicker.MediaTypeOptions.All,
      base64: true, //return base64 data.
      //this will allow the Vision API to read this image.
    });
    if (!result.cancelled) { //if the user submits an image,
      setImage(result.uri);
      //run the onSubmit handler and pass in the image data. 
      const googleText = await onSubmit(result.base64);
    }
  };
  return (
    <View>
      <Button title="Pick an image from camera roll" onPress={pickImage} />
      {image && (
        <Image
          source={{ uri: image }}
          style={{ width: 200, height: 200, resizeMode:"contain" }}
        />
      )}
    </View>
  );
}
export default ImagePickerComponent;

Here’s a brief explanation:

  • After initializing ImagePickerComponent, we created the pickImage function, which prompts the user to select a file
  • If the user submits an image, the program runs the onSubmit handler and passes the image’s base64 data to it
  • After submission, the app displays the chosen image in the UI

All that’s left for us is to render our custom image picker component. To do so, write the following code in App.js:

import React from "react";
import { View } from "react-native";
import ImagePickerComponent from "./ImagePickerComponent";

export default function App() {
  return (
    <View>
      <ImagePickerComponent onSubmit={console.log} />
    </View>
  );
}

Here, we are rendering our ImagePickerComponent module and passing in an onSubmit handler that logs the chosen image’s encoded data to the console.

Run the app using this Bash command:

expo start

Running an Image Picker Component for the Text Detector

Our code works! In the next section, we will use the power of Google Vision to implement OCR in our app.

Connecting Google Cloud Vision with the image picker

Edit the following piece of code in App.js:

import callGoogleVisionAsync from "./helperFunctions.js";

return (
  <View>
    {/*Replace the onSubmit handler:*/}
    <ImagePickerComponent onSubmit={callGoogleVisionAsync} />
  </View>
);

In this snippet, we replaced our onSubmit handler with callGoogleVisionAsync. As a result, this will send the user’s input to Google servers for OCR operations.

This will be the output:

Text Detector Google Vision

Notice that the program now successfully extracts text from the image. This means our code works!

As the last step, attach this piece of code to the end of callGoogleVisionAsync:

//file: helperFunctions.js. 
//add this code to the end of callGoogleVisionAsync function
const detectedText = result.responses[0].fullTextAnnotation;
return detectedText
  ? detectedText
  : { text: "This image doesn't contain any text!" };

This tells the program to first check whether the response contains detected text. If so, the function returns the extracted text. Otherwise, it returns a fallback object whose text property tells the user that the image contains no text.

In the end, your complete callGoogleVisionAsync function should look like this:

//file: helperFunctions.js
async function callGoogleVisionAsync(image) {
  const body = generateBody(image);
  const response = await fetch(API_URL, {
    method: "POST",
    headers: {
      Accept: "application/json",
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });
  const result = await response.json();
  console.log(result);
  const detectedText = result.responses[0].fullTextAnnotation;
  return detectedText
    ? detectedText
    : { text: "This image doesn't contain any text!" };
}
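To see the fallback logic in isolation, here is a standalone sketch with mocked responses. The helper name extractText and the sample text are mine for illustration; the response fields follow the shape returned by the Vision API:

```javascript
// Standalone sketch of the fallback logic at the end of callGoogleVisionAsync.
// 'extractText' is a hypothetical helper name; the mock data is invented.
function extractText(result) {
  const detectedText = result.responses[0].fullTextAnnotation;
  return detectedText
    ? detectedText
    : { text: "This image doesn't contain any text!" };
}

const withText = { responses: [{ fullTextAnnotation: { text: 'Total: $42.00' } }] };
const withoutText = { responses: [{}] };

console.log(extractText(withText).text); // 'Total: $42.00'
console.log(extractText(withoutText).text); // "This image doesn't contain any text!"
```

Because both branches return an object with a text property, the UI code that reads responseData.text works the same way whether or not the image contained text.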

Rendering OCR data to the UI

Now that we have implemented OCR in our program, all that remains for us is to display the image’s text to the UI.

Find and edit the following code in ImagePickerComponent.js:

//code to find:
if (!result.cancelled) {
  setImage(result.uri);
  setText("Loading.."); //set value of text Hook
  const responseData = await onSubmit(result.base64);
  setText(responseData.text); //change the value of this Hook again.
}
//extra code removed for brevity
//Finally, display the value of 'text' to the user
return (
  <View>
    <Text>{text}</Text>
    {/*Further code..*/}
  </View>
);

Here’s what changed:

  • When the user has chosen an image, we set the text Hook first to a loading message and then to the response data
  • In the end, we display the value of the text variable to the user

Text Detector RN Final Output

And we’re done!

In the end, your ImagePickerComponent should look like so:

function ImagePickerComponent({ onSubmit }) {
  const [image, setImage] = useState(null);
  const [text, setText] = useState("Please add an image");
  const pickImage = async () => {
    let result = await ImagePicker.launchImageLibraryAsync({
      mediaTypes: ImagePicker.MediaTypeOptions.All,
      base64: true,
    });
    if (!result.cancelled) {
      setImage(result.uri);
      setText("Loading..");
      const responseData = await onSubmit(result.base64);
      setText(responseData.text);
    }
  };
  return (
    <View>
      <Button title="Pick an image from camera roll" onPress={pickImage} />
      {image && (
        <Image
          source={{ uri: image }}
          style={{ width: 400, height: 300, resizeMode: "contain" }}
        />
      )}
      <Text>{text}</Text>
    </View>
  );
}

Conclusion

Here is the source code for this article.

In this article, you learned how to use Google Cloud Vision in your project to implement text detection. Beyond data entry, we can use our brand-new OCR app in several other situations, for example:

  • Retail and marketing: Supermarkets use text detection technology to scan and store discount coupons. Here, the OCR program scans the text and checks whether the coupon is usable
  • Business: When the user writes their data on a paper document, government institutions use OCR to digitize the record. In this use case, the software first scans the client data and validates it to ensure that the user has followed a given format

If you encountered any difficulty, I encourage you to play with and deconstruct the code so that you can fully understand its inner workings.

Thank you so much for making it to the end! Happy coding!

