TensorFlow's Object Detection API

Revision as of 21:40, 21 June 2021 by Admin (Talk | contribs)




The purpose of this tutorial is to explain how to train your own CNN object detection classifier for multiple objects, starting from scratch.

By the end of the tutorial you will have a program that can identify and draw boxes around specific objects in pictures, videos, or a webcam feed.

The tutorial is written for Windows 10. The general procedure can also be used for Linux, but file paths and package installation commands will need to be changed accordingly.

In this tutorial TensorFlow-GPU v1.9 is used (the version installed in Step 6), but the procedure will likely also work for later versions of TensorFlow.

TensorFlow-GPU allows your PC to use the GPU (Graphics Processing Unit) to provide extra processing power while training, so it will be used for this tutorial.

Using TensorFlow-GPU instead of regular TensorFlow reduces training time by a factor of about 8 (3 hours to train instead of 24 hours).

The CPU-only version of TensorFlow can also be used for this tutorial, but it will take longer.

If you use CPU-only TensorFlow, you do not need to install CUDA and cuDNN in Steps 1 and 2.


The machine used for this tutorial had:

  • RAM: 16 GB
  • CPU: Intel i7
  • Disk: 500 GB
  • GPU: Nvidia GeForce 1060 6 GB

Operating system

The operating system used in this tutorial is Windows 10.

The same procedure can be followed on Linux with adjusted file paths and installation commands.

Steps for installing TensorFlow GPU

It is very important to download the correct versions of the packages described in this tutorial; otherwise the installation will not work!


  • Nvidia GPU (GTX 650 or newer)
  • CUDA Toolkit v9.0
  • cuDNN v7.0.5
  • Anaconda (optional but recommended)

1. Install CUDA Toolkit

CUDA from NVIDIA is a parallel computing platform and programming model that enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).

Follow this link to download and install CUDA Toolkit v9.0 for Windows 10.

2. Install cuDNN

NVIDIA CUDA Deep Neural Network (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. It provides highly tuned implementations of routines arising frequently in DNN applications.

  • Go to https://developer.nvidia.com/rdp/cudnn-download
  • Create a user profile if needed and log in
  • Select cuDNN v7.0.5 (Feb 28, 2018), for CUDA 9.0
  • Download cuDNN v7.0.5 Library for Windows 10
  • Extract the contents of the zip file (i.e. the folder named cuda) inside <INSTALL_PATH>\NVIDIA GPU Computing Toolkit\CUDA\v9.0\,
    where <INSTALL_PATH> points to the installation directory specified during the installation of the CUDA Toolkit.
    By default <INSTALL_PATH> = C:\Program Files.

3. Install Anaconda

For this tutorial we will use Anaconda.

Anaconda is a Python-based data processing and scientific computing platform. It bundles many useful third-party libraries.

Download Anaconda from: https://www.anaconda.com/download/

4. Environment Setup

In Windows, go to Start and search for “environment variables”.

  • Click the Environment Variables button
  • Click on the Path system variable and select edit
  • Add the following paths:
    • <INSTALL_PATH>\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin
    • <INSTALL_PATH>\NVIDIA GPU Computing Toolkit\CUDA\v9.0\libnvvp
    • <INSTALL_PATH>\NVIDIA GPU Computing Toolkit\CUDA\v9.0\extras\CUPTI\libx64
    • <INSTALL_PATH>\NVIDIA GPU Computing Toolkit\CUDA\v9.0\cuda\bin

Env var.png
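After editing the Path variable, it can help to confirm that all four folders were actually added. The following is a minimal sketch rather than part of the original procedure: it assumes CUDA was installed under the default <INSTALL_PATH> (C:\Program Files), and the helper names are made up for illustration.

```python
import ntpath

# Assumption: CUDA was installed under the default <INSTALL_PATH>, C:\Program Files.
CUDA_ROOT = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0"
REQUIRED_DIRS = [
    CUDA_ROOT + r"\bin",
    CUDA_ROOT + r"\libnvvp",
    CUDA_ROOT + r"\extras\CUPTI\libx64",
    CUDA_ROOT + r"\cuda\bin",
]


def normalize(path):
    """Normalize a Windows path so comparisons ignore case and trailing slashes."""
    return ntpath.normcase(ntpath.normpath(path.strip()))


def missing_path_entries(required, path_value):
    """Return the entries of `required` that are absent from a ';'-separated Path string."""
    present = {normalize(p) for p in path_value.split(";") if p.strip()}
    return [p for p in required if normalize(p) not in present]
```

In a newly opened command prompt, calling missing_path_entries(REQUIRED_DIRS, os.environ["PATH"]) (after import os) should return an empty list if the four entries were saved.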

5. Create a new Conda virtual environment

Open a new Anaconda/Command Prompt window as Administrator.

Type the following command:

conda create -n labx pip python=3.5

Now activate the newly created virtual environment by typing the following in the Anaconda Prompt window:

conda activate labx

6. Install TensorFlow GPU for Python

Type the following on the command line:

(labx) C:> python -m pip install --upgrade pip
(labx) C:> pip install --ignore-installed --upgrade tensorflow-gpu==1.9

6.1 Test your Installation

Start a new Python interpreter session by typing:

(labx) C:> python


>>> import tensorflow as tf

Then type:

>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()

Finally type:

>>> print(sess.run(hello))
b'Hello, TensorFlow!'
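The test above confirms that TensorFlow runs, but not that it can see the GPU. A possible extra check is sketched below. It assumes TensorFlow 1.x reports GPU devices with names of the form "/device:GPU:0"; the function names are illustrative only.

```python
def gpu_device_names(device_names):
    """Keep only the names that refer to a GPU, assuming the '/device:GPU:N' form."""
    return [name for name in device_names if name.startswith("/device:GPU:")]


def list_tensorflow_devices():
    """Ask TensorFlow which devices it can see (requires TensorFlow to be installed)."""
    from tensorflow.python.client import device_lib
    return [device.name for device in device_lib.list_local_devices()]


def print_visible_gpus():
    """Print the GPUs visible to TensorFlow, or 'none' if only the CPU is found."""
    gpus = gpu_device_names(list_tensorflow_devices())
    print("GPUs visible to TensorFlow:", gpus if gpus else "none")
```

Calling print_visible_gpus() inside the labx environment should list at least one GPU if CUDA and cuDNN were picked up; if it prints "none", the paths from the Environment Setup step are worth rechecking.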

7. Install TensorFlow Models

Now that you have installed TensorFlow, it is time to install the models used by TensorFlow.

7.1 Install Prerequisites

Prerequisite packages:

Name          Version
pillow        5.4.1-py36hdc69c19_0
lxml          4.3.1-py36h1350720_0
jupyter       1.0.0-py36_7
matplotlib    3.0.2-py36hc8f65d3_0
opencv        3.4.2-py36h40b0b35_0
setuptools    39.1.0


(labx) C:> pip install pillow lxml jupyter matplotlib opencv-python pandas

8. Clone the TensorFlow Models from Github

Create a new directory, e.g. C:\labx, and change into it:


(labx) C:> cd C:\labx

Clone the TensorFlow models repository with git:


(labx) C:\labx> git clone https://github.com/tensorflow/models.git
(labx) C:\labx> cd models

Check out the commit 4b566d4e800ff82579eda1f682f9ce7aa8792ea8.


(labx) C:\labx\models> git checkout 4b566d4e800ff82579eda1f682f9ce7aa8792ea8

9. Compile Protobufs and run setup.py


(labx) C:> cd C:\labx\models\research


Enter the following as a single command. The ^ at the end of each line tells the Windows Command Prompt that the command continues on the next line:

(labx) C:\labx\models\research> protoc --python_out=. .\object_detection\protos\anchor_generator.proto .\object_detection\protos\argmax_matcher.proto .\object_detection\protos\bipartite_matcher.proto ^
.\object_detection\protos\box_coder.proto .\object_detection\protos\box_predictor.proto .\object_detection\protos\eval.proto .\object_detection\protos\faster_rcnn.proto ^
.\object_detection\protos\faster_rcnn_box_coder.proto .\object_detection\protos\grid_anchor_generator.proto .\object_detection\protos\hyperparams.proto ^
.\object_detection\protos\image_resizer.proto .\object_detection\protos\input_reader.proto .\object_detection\protos\losses.proto .\object_detection\protos\matcher.proto ^
.\object_detection\protos\mean_stddev_box_coder.proto .\object_detection\protos\model.proto .\object_detection\protos\optimizer.proto .\object_detection\protos\pipeline.proto ^
.\object_detection\protos\post_processing.proto .\object_detection\protos\preprocessor.proto .\object_detection\protos\region_similarity_calculator.proto ^
.\object_detection\protos\square_box_coder.proto .\object_detection\protos\ssd.proto .\object_detection\protos\ssd_anchor_generator.proto .\object_detection\protos\string_int_label_map.proto ^
.\object_detection\protos\train.proto .\object_detection\protos\keypoint_box_coder.proto

This creates a name_pb2.py file from every name.proto file in the \object_detection\protos folder.
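Typing every .proto file by hand is error-prone. As an alternative (not the method used in this tutorial), the command can be assembled from whatever .proto files are present in the folder. The sketch below assumes protoc is on the Path and that it is run from the models\research directory; the function names are illustrative.

```python
import glob
import os


def build_protoc_command(protos_dir="object_detection/protos"):
    """Return the protoc argument list covering every .proto file in protos_dir."""
    proto_files = sorted(glob.glob(os.path.join(protos_dir, "*.proto")))
    if not proto_files:
        raise FileNotFoundError("no .proto files found in " + protos_dir)
    return ["protoc", "--python_out=."] + proto_files


def compile_protos(protos_dir="object_detection/protos"):
    """Run protoc on all .proto files; assumes the current directory is models/research."""
    import subprocess
    subprocess.check_call(build_protoc_command(protos_dir))
```

Note that this compiles all .proto files found in the folder, which may be more than the ones listed in the command above.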

Type the following commands from the C:\labx\models\research directory:

(labx) C:\labx\models\research> python setup.py build
(labx) C:\labx\models\research> python setup.py install

10. Adding necessary Environment Variables

On Windows 10 the following folders must be added to your PYTHONPATH environment variable (see Environment Setup). With the directory layout used in this tutorial, <PATH_TO_TF>\TensorFlow corresponds to C:\labx:

  • <PATH_TO_TF>\TensorFlow\models\research\object_detection
  • <PATH_TO_TF>\TensorFlow\models\research
  • <PATH_TO_TF>\TensorFlow\models\research\slim

Env var2.png
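One way to confirm that PYTHONPATH is set correctly is to ask Python whether it can locate the packages that live in those folders. This is a small sketch under the assumption that the object_detection package sits in models\research and the nets package in models\research\slim; the helper name is made up.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that Python cannot locate on sys.path."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Assumption: 'object_detection' comes from models\research and 'nets' from models\research\slim.
REQUIRED_MODULES = ["object_detection", "nets"]
```

Running missing_modules(REQUIRED_MODULES) in a newly opened prompt with the labx environment active should return an empty list; any name it returns points to a folder that is not yet on PYTHONPATH.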

11. Test TensorFlow setup to verify it works

The TensorFlow Object Detection API is now set up to use pre-trained models for object detection, or to train a new one.

You can test it out and verify your installation is working.


(labx) C:\labx\models\research\object_detection> jupyter notebook object_detection_tutorial.ipynb

This opens a Jupyter page on your default web browser and allows you to step through the code one section at a time.

Run each section by clicking the "Run" button in the upper toolbar.

Once you have stepped all the way through the script, you should see two labeled images at the bottom of the page.

If you see this, then everything is working properly!

Jupyter notebook dogs.jpg

12. Download the Faster-RCNN-Inception-V2-COCO model from TensorFlow's model zoo

In this tutorial we will use the pre-trained model faster_rcnn_inception_v2_coco_2018_01_28.

For more information about this model please visit: this page

This model is fast enough on Raspberry Pi.

Download the model here

Extract the faster_rcnn_inception_v2_coco_2018_01_28 folder to the C:\labx\models\research\object_detection folder.

13. Download tutorial's repository from GitHub

We will use the structure and scripts from another tutorial (see references).

You can download the zip file or clone the GitHub repository from here.

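If you download the zip file, extract all the contents directly into the C:\labx\models\research\object_detection folder.

Or, if you prefer to clone the repository, run the command below and then move the contents of the cloned folder directly into C:\labx\models\research\object_detection: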


(labx) C:\labx\models\research\object_detection>git clone https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10.git

We will train our own object detector, therefore we need to delete the following files (do not delete the folders):

  • All files in \object_detection\images\train
  • All files in \object_detection\images\test
  • The “test_labels.csv” file in \object_detection\images
  • The “train_labels.csv” file in \object_detection\images
  • All files in \object_detection\training
  • All files in \object_detection\inference_graph
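The cleanup listed above can also be scripted, which avoids deleting a folder by accident. The following is a minimal sketch, not a script from the downloaded repository; the function and constant names are hypothetical, and the folder list simply mirrors the bullets above.

```python
import os

# Folders and files named in the list above, relative to \object_detection.
FOLDERS = ["images/train", "images/test", "training", "inference_graph"]
CSV_FILES = ["images/test_labels.csv", "images/train_labels.csv"]


def clear_files(folder):
    """Delete the files directly inside `folder`, keeping the folder and any subfolders."""
    removed = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            os.remove(path)
            removed.append(name)
    return removed


def reset_tutorial_data(object_detection_dir):
    """Empty the listed folders and remove the two label CSV files if they exist."""
    for folder in FOLDERS:
        clear_files(os.path.join(object_detection_dir, folder))
    for csv_file in CSV_FILES:
        path = os.path.join(object_detection_dir, csv_file)
        if os.path.isfile(path):
            os.remove(path)
```

For the layout in this tutorial the call would be reset_tutorial_data(r"C:\labx\models\research\object_detection"). The deletion is permanent, so it is worth double-checking the path first.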

14. Gather and Label Pictures

Now that the TensorFlow Object Detection API is all set up and ready to go, we need to provide the images it will use to train a new detection classifier.

14.1 Gather Pictures

We will train our model to recognize two objects: an orange cylinder and a blue cube.

Cube2.JPG Cylinder2.JPG

To take pictures you can simply use your mobile phone.

TensorFlow needs hundreds of images of an object to train a good detection classifier.

To train a robust classifier, the training images should have random objects in the image along with the desired objects, and should have a variety of backgrounds and lighting conditions.

There should be some images where the desired object is partially obscured, overlapped with something else, or only halfway in the picture.

Make sure the images aren’t too large. They should be less than 200KB each, and their resolution shouldn’t be more than 720x1280.

The larger the images are, the longer it will take to train the classifier.

You can use the resizer.py script in this repository to reduce the size of the images.
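To find out which pictures still exceed the suggested file size, a quick check such as the one below may help. It is only a sketch: it looks at file size (the 200KB guideline above) and does not check resolution, and the function name and extension list are assumptions.

```python
import os

MAX_BYTES = 200 * 1024  # the 200KB guideline mentioned above


def oversized_images(folder, max_bytes=MAX_BYTES, extensions=(".jpg", ".jpeg", ".png")):
    """Return (name, size_in_bytes) for image files in `folder` larger than max_bytes."""
    result = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if name.lower().endswith(extensions) and os.path.isfile(path):
            size = os.path.getsize(path)
            if size > max_bytes:
                result.append((name, size))
    return result
```

Any file it reports is a candidate for resizing before training.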

After you have all the pictures you need, move:

  • 20% of them to the \object_detection\images\test directory
  • 80% of them to the \object_detection\images\train directory.

Make sure there is a variety of pictures in both the \test and \train directories.
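A random split is a simple way to get variety in both directories. The sketch below only decides which file names go where; moving the files is left to you. The function name and the fixed seed are illustrative choices, not part of the original tutorial.

```python
import random


def split_train_test(filenames, test_fraction=0.2, seed=0):
    """Shuffle the file names and return (train, test) lists, with about test_fraction in test."""
    names = sorted(filenames)
    # A fixed seed makes the split repeatable from run to run.
    random.Random(seed).shuffle(names)
    test_count = int(round(len(names) * test_fraction))
    return names[test_count:], names[:test_count]
```

With 100 pictures and the default fraction, this yields 80 names for \train and 20 for \test.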

14.2 Label Pictures

LabelImg is a great tool for labeling images, and its GitHub page has very clear instructions on how to install and use it.

LabelImg GitHub link

LabelImg download link

LabelImg saves a .xml file containing the label data for each image.

These .xml files will be used to generate TFRecords, which are one of the inputs to the TensorFlow trainer.

Once you have labeled and saved each image, there will be one .xml file for each image in the \test and \train directories.
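To get a feel for what these .xml files contain, the sketch below reads the bounding boxes from one annotation. It assumes LabelImg was left on its default Pascal VOC output format; the sample annotation and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET


def read_boxes(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...]) from one annotation file."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        boxes.append((
            obj.findtext("name"),
            int(box.findtext("xmin")),
            int(box.findtext("ymin")),
            int(box.findtext("xmax")),
            int(box.findtext("ymax")),
        ))
    return filename, boxes


# Hypothetical annotation with a single labeled object.
SAMPLE = """
<annotation>
  <filename>cube_01.jpg</filename>
  <size><width>1280</width><height>720</height><depth>3</depth></size>
  <object>
    <name>blue_cube</name>
    <bndbox><xmin>100</xmin><ymin>150</ymin><xmax>300</xmax><ymax>340</ymax></bndbox>
  </object>
</annotation>
"""
```

Here read_boxes(SAMPLE) returns the file name together with one box labeled blue_cube. The label text must match the class names used later in the label map.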

15. Generate Training Data

With the images labeled, it’s time to generate the TFRecords that serve as input data to the TensorFlow training model.

This tutorial uses the xml_to_csv.py and generate_tfrecord.py scripts.

First, the image .xml data will be used to create .csv files containing all the data for the train and test images.

From the \object_detection folder, type the following command in the Anaconda command prompt:

(labx) C:\labx\models\research\object_detection> python xml_to_csv.py

This creates a train_labels.csv and test_labels.csv file in the \object_detection\images folder.

Next, open the \object_detection\generate_tfrecord.py file in a text editor.

Replace the label map starting at line 31 with your own label map, where each object is assigned an ID number.

# TO-DO replace this with label map
def class_text_to_int(row_label):
    # Map each class name to the ID used in labelmap.pbtxt
    if row_label == 'blue_cube':
        return 1
    elif row_label == 'orange_cylinder':
        return 2
    else:
        # Any other label is mapped to 0
        return 0


Then, generate the TFRecord files by typing these commands from the \object_detection folder:

python generate_tfrecord.py --csv_input=images\train_labels.csv --image_dir=images\train --output_path=train.record
python generate_tfrecord.py --csv_input=images\test_labels.csv --image_dir=images\test --output_path=test.record

16. Create Label Map and Configure Training

The last thing to do before training is to create a label map and edit the training configuration file.

16.1 Label map

The label map tells the trainer what each object is by defining a mapping of class names to class ID numbers.

Create a new file and save it as labelmap.pbtxt in the C:\labx\models\research\object_detection\training folder.

In the text editor, copy or type in the label map in the format below:

item {
  id: 1
  name: 'blue_cube'
}

item {
  id: 2
  name: 'orange_cylinder'
}

The label map ID numbers should be the same as what is defined in the generate_tfrecord.py file.
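A mismatch between the two files is easy to miss, so a quick comparison can be useful. The sketch below parses a label map with a regular expression and compares it with the IDs from generate_tfrecord.py. It assumes each item lists id before name, uses single quotes, and is closed by a brace; it is not a general .pbtxt parser, and the names are illustrative.

```python
import re

ITEM_PATTERN = re.compile(r"item\s*\{\s*id:\s*(\d+)\s*name:\s*'([^']+)'\s*\}")


def parse_label_map(text):
    """Return {name: id} for label map items written with id first, then name."""
    return {name: int(item_id) for item_id, name in ITEM_PATTERN.findall(text)}


LABEL_MAP = """
item {
  id: 1
  name: 'blue_cube'
}
item {
  id: 2
  name: 'orange_cylinder'
}
"""

# IDs as returned by class_text_to_int in generate_tfrecord.py
EXPECTED = {"blue_cube": 1, "orange_cylinder": 2}
```

If parse_label_map(LABEL_MAP) == EXPECTED is False, one of the two files needs to be corrected before training.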

16.2 Configure training

Finally, the object detection training pipeline must be configured. It defines which model and what parameters will be used for training.

Navigate to C:\labx\models\research\object_detection\samples\configs and copy the faster_rcnn_inception_v2_pets.config file into the \object_detection\training directory.

Make the following changes:

  • Line 9. Change num_classes to the number of different objects you want the classifier to detect. In our case it will be num_classes : 2.
  • Line 106. Change fine_tune_checkpoint to:
    • fine_tune_checkpoint : "C:/labx/models/research/object_detection/faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt"
  • Lines 123 and 125. In the train_input_reader section, change input_path and label_map_path to:
    • input_path : "C:/labx/models/research/object_detection/train.record"
    • label_map_path: "C:/labx/models/research/object_detection/training/labelmap.pbtxt"
  • Line 130. Change num_examples to the number of images you have in the \images\test directory.
  • Lines 135 and 137. In the eval_input_reader section, change input_path and label_map_path to:
    • input_path : "C:/labx/models/research/object_detection/test.record"
    • label_map_path: "C:/labx/models/research/object_detection/training/labelmap.pbtxt"
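Because several paths are edited by hand, it can be worth listing them once more before starting a long training run. The sketch below pulls the path settings out of the config text with a regular expression. It assumes unedited entries in the sample config still contain the placeholder text PATH_TO_BE_CONFIGURED; the function names are hypothetical.

```python
import re

PATH_FIELDS = ("fine_tune_checkpoint", "input_path", "label_map_path")
FIELD_PATTERN = re.compile(r'(%s)\s*:\s*"([^"]*)"' % "|".join(PATH_FIELDS))


def config_paths(config_text):
    """Return (field, path) pairs, in file order, for the path settings edited above."""
    return FIELD_PATTERN.findall(config_text)


def unedited_paths(config_text, marker="PATH_TO_BE_CONFIGURED"):
    """Return the (field, path) pairs that still contain the placeholder marker."""
    return [(field, path) for field, path in config_paths(config_text) if marker in path]
```

To use it, read training/faster_rcnn_inception_v2_pets.config into a string and pass it to unedited_paths; an empty list suggests that no placeholder paths remain.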


17. Run the Training

From the \object_detection directory, issue the following command to begin training:

python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config


17.1 View the progress of the training

You can view the progress of the training job by using TensorBoard.

Open a new instance of Anaconda Prompt, activate the labx virtual environment, change to the C:\labx\models\research\object_detection directory, and issue the following command:

(labx) C:\labx\models\research\object_detection>tensorboard --logdir=training


This will create a webpage on your local machine at YourPCName:6006, which can be viewed through a web browser.

The TensorBoard page provides information and graphs that show how the training is progressing.

One important graph is the Loss graph, which shows the overall loss of the classifier over time.

Export Inference Graph

Now that training is complete, the last step is to generate the frozen inference graph (.pb file).

From the \object_detection folder, issue the following command, where “XXXX” in “model.ckpt-XXXX” should be replaced with the highest-numbered .ckpt file in the training folder:

python export_inference_graph.py --input_type image_tensor --pipeline_config_path training/faster_rcnn_inception_v2_pets.config --trained_checkpoint_prefix training/model.ckpt-XXXX --output_directory inference_graph

This creates a frozen_inference_graph.pb file in the \object_detection\inference_graph folder. The .pb file contains the object detection classifier.
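Picking the highest-numbered checkpoint by eye is easy to get wrong, because file listings sort names as text (for example, 987 sorts after 1523). The sketch below finds the value for “model.ckpt-XXXX” by comparing the step numbers numerically. It assumes the training folder contains files named model.ckpt-XXXX.index, as TensorFlow 1.x normally writes them; the function name is illustrative.

```python
import re

CHECKPOINT_PATTERN = re.compile(r"^model\.ckpt-(\d+)\.index$")


def latest_checkpoint_prefix(filenames):
    """Return 'model.ckpt-XXXX' for the highest step among the given file names, or None."""
    steps = [int(m.group(1)) for m in map(CHECKPOINT_PATTERN.match, filenames) if m]
    return "model.ckpt-%d" % max(steps) if steps else None
```

From the \object_detection folder, latest_checkpoint_prefix(os.listdir("training")) (after import os) gives the prefix to pass to --trained_checkpoint_prefix, prefixed with training/.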

Use Your Newly Trained Object Detection Classifier!

To test your object detector, move a picture of the object or objects into the \object_detection folder, and change the IMAGE_NAME variable in the Object_detection_image.py to match the file name of the picture.

To run any of the scripts, type “idle” in the Anaconda Command Prompt (with the “labx” virtual environment activated) and press ENTER.

This will open IDLE, and from there, you can open any of the scripts and run them.

The script will detect the object in the picture like this:

Detect object2.png

Next step

The next step is to use the model on a Raspberry Pi 4 to detect objects with a webcam.

Please refer to Install Tensorflow on Raspberry Pi.


Link to PDF

