Added Inference graph notebooks

Qberto · Oct 20, 2017 · 4132dc0
1 parent 97e3799
commit 4132dc0

Showing 2 changed files with 667 additions and 0 deletions.
diff --git a/object_detection_CAFO_screencap.ipynb b/object_detection_CAFO_screencap.ipynb
@@ -0,0 +1,338 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Object Detection Demo\n",
+    "Welcome to the object detection inference walkthrough!  This notebook will walk you step by step through the process of using a pre-trained model to detect objects in an image. Make sure to follow the [installation instructions](https://github.com/tensorflow/models/blob/master/object_detection/g3doc/installation.md) before you start."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Imports"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "collapsed": true,
+    "scrolled": true
+   },
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "import os\n",
+    "import six.moves.urllib as urllib\n",
+    "import sys\n",
+    "import tarfile\n",
+    "import tensorflow as tf\n",
+    "import zipfile\n",
+    "\n",
+    "from PIL import ImageGrab\n",
+    "import time\n",
+    "\n",
+    "from collections import defaultdict\n",
+    "from io import StringIO\n",
+    "from matplotlib import pyplot as plt\n",
+    "from PIL import Image\n",
+    "\n",
+    "import cv2"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Env setup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "# This is needed to display the images.\n",
+    "%matplotlib inline\n",
+    "\n",
+    "# This is needed since the notebook is stored in the object_detection folder.\n",
+    "sys.path.append(\"..\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Object detection imports\n",
+    "Here are the imports from the object detection module."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "from utils import label_map_util\n",
+    "\n",
+    "from utils import visualization_utils as vis_util"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Model preparation "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Variables\n",
+    "\n",
+    "Any model exported using the `export_inference_graph.py` tool can be loaded here simply by changing `PATH_TO_CKPT` to point to a new .pb file.  \n",
+    "\n",
+    "By default we use an \"SSD with Mobilenet\" model here. See the [detection model zoo](https://github.com/tensorflow/models/blob/master/object_detection/g3doc/detection_model_zoo.md) for a list of other models that can be run out-of-the-box with varying speeds and accuracies."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "# Directory containing the exported model (nothing is downloaded; it must exist locally).\n",
+    "MODEL_NAME = 'cafo_graph'\n",
+    "\n",
+    "# Path to frozen detection graph. This is the actual model that is used for the object detection.\n",
+    "PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'\n",
+    "\n",
+    "# Label map file used to assign the correct label string to each box.\n",
+    "PATH_TO_LABELS = os.path.join('training', 'object-detection.pbtxt')\n",
+    "\n",
+    "NUM_CLASSES = 1"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Download Model"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Load a (frozen) Tensorflow model into memory."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "detection_graph = tf.Graph()\n",
+    "with detection_graph.as_default():\n",
+    "  od_graph_def = tf.GraphDef()\n",
+    "  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:\n",
+    "    serialized_graph = fid.read()\n",
+    "    od_graph_def.ParseFromString(serialized_graph)\n",
+    "    tf.import_graph_def(od_graph_def, name='')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Loading label map\n",
+    "Label maps map indices to category names, so that when our convolutional network predicts `5`, we know that this corresponds to `airplane`. Here we use internal utility functions, but anything that returns a dictionary mapping integers to appropriate string labels would be fine."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "label_map = label_map_util.load_labelmap(PATH_TO_LABELS)\n",
+    "categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)\n",
+    "category_index = label_map_util.create_category_index(categories)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Helper code"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def load_image_into_numpy_array(image):\n",
+    "  (im_width, im_height) = image.size\n",
+    "  return np.array(image.getdata()).reshape(\n",
+    "      (im_height, im_width, 3)).astype(np.uint8)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Detection"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "# For the sake of simplicity we use only one image here:\n",
+    "# image7.jpg (range(7, 8) yields just 7).\n",
+    "# If you want to test the code with your own images, just add their paths\n",
+    "# to TEST_IMAGE_PATHS.\n",
+    "PATH_TO_TEST_IMAGES_DIR = 'test_images'\n",
+    "# PATH_TO_TEST_IMAGES_DIR = 'custom_images'\n",
+    "TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(7, 8) ]\n",
+    "\n",
+    "# Size, in inches, of the output images.\n",
+    "IMAGE_SIZE = (12, 8)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def cafo_in_image(classes_arr, scores_arr, obj_thresh=5):\n",
+    "    # Returns True if any of the first obj_thresh detections has class ID 1 (CAFO).\n",
+    "    # Note: scores_arr is not thresholded here; filter by score separately if needed.\n",
+    "    cafo_found_flag = False\n",
+    "    for ix in range(min(obj_thresh, len(classes_arr))):\n",
+    "        if classes_arr[ix] == 1:\n",
+    "            cafo_found_flag = True\n",
+    "    return cafo_found_flag"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "scrolled": false
+   },
+   "outputs": [
+    {
+     "ename": "OSError",
+     "evalue": "screen grab failed",
+     "output_type": "error",
+     "traceback": [
+      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
+      "\u001b[1;31mOSError\u001b[0m                                   Traceback (most recent call last)",
+      "\u001b[1;32m<ipython-input-10-0399c7e2518c>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m     13\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     14\u001b[0m             \u001b[1;31m# Expand dimensions since the model expects images to have shape: [1, None, None, 3]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 15\u001b[1;33m             \u001b[0mimage_np\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0marray\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mImageGrab\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mgrab\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mbbox\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;36m40\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;36m800\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;36m640\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     16\u001b[0m \u001b[1;31m#             image_np = np.array(ImageGrab.grab(bbox=(30,100,900,640)))\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     17\u001b[0m             \u001b[0mimage_np_expanded\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mexpand_dims\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mimage_np\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
+      "\u001b[1;32mC:\\ProgramData\\Anaconda3\\lib\\site-packages\\PIL\\ImageGrab.py\u001b[0m in \u001b[0;36mgrab\u001b[1;34m(bbox)\u001b[0m\n\u001b[0;32m     39\u001b[0m         \u001b[0mos\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0munlink\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     40\u001b[0m     \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 41\u001b[1;33m         \u001b[0msize\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mdata\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mgrabber\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     42\u001b[0m         im = Image.frombytes(\n\u001b[0;32m     43\u001b[0m             \u001b[1;34m\"RGB\"\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0msize\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mdata\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
+      "\u001b[1;31mOSError\u001b[0m: screen grab failed"
+     ]
+    }
+   ],
+   "source": [
+    "with detection_graph.as_default():\n",
+    "    with tf.Session(graph=detection_graph) as sess:\n",
+    "        # Define input and output Tensors for detection_graph\n",
+    "        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')\n",
+    "        # Each box represents a part of the image where a particular object was detected.\n",
+    "        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')\n",
+    "        # Each score represents the level of confidence for the corresponding detected object.\n",
+    "        # Score is shown on the result image, together with the class label.\n",
+    "        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')\n",
+    "        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')\n",
+    "        num_detections = detection_graph.get_tensor_by_name('num_detections:0')\n",
+    "        while True:\n",
+    "            # Grab a region of the screen (left, top, right, bottom) as an RGB array.\n",
+    "            image_np = np.array(ImageGrab.grab(bbox=(0,40,800,640)))\n",
+    "            # image_np = np.array(ImageGrab.grab(bbox=(30,100,900,640)))\n",
+    "            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]\n",
+    "            image_np_expanded = np.expand_dims(image_np, axis=0)\n",
+    "            # Actual detection.\n",
+    "            (boxes, scores, classes, num) = sess.run(\n",
+    "              [detection_boxes, detection_scores, detection_classes, num_detections],\n",
+    "              feed_dict={image_tensor: image_np_expanded})\n",
+    "            # Visualization of the results of a detection.\n",
+    "            vis_util.visualize_boxes_and_labels_on_image_array(\n",
+    "              image_np,\n",
+    "              np.squeeze(boxes),\n",
+    "              np.squeeze(classes).astype(np.int32),\n",
+    "              np.squeeze(scores),\n",
+    "              category_index,\n",
+    "              use_normalized_coordinates=True,\n",
+    "              line_thickness=8, min_score_thresh=0.8)\n",
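+    "            # ImageGrab returns RGB, but cv2.imshow expects BGR, so convert before display.\n",
+    "            cv2.imshow('object detection', cv2.resize(cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR), (800,600)))\n",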
+    "            if cv2.waitKey(25) & 0xFF == ord('q'):\n",
+    "                cv2.destroyAllWindows()\n",
+    "                break\n",
+    "                \n",
+    "            cafo_found = cafo_in_image(np.squeeze(classes).astype(np.int32), \n",
+    "                                           np.squeeze(scores), \n",
+    "                                           obj_thresh=2)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
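
Note on `cafo_in_image`: the helper in the notebook looks only at the first few detections and does not use the score array to filter them. A minimal sketch of a variant that also applies a score threshold might look like the following. It assumes class ID 1 is the CAFO class (the notebook sets `NUM_CLASSES = 1`) and reuses the 0.8 value passed as `min_score_thresh` to the visualization call. The names `cafo_detected`, `CAFO_CLASS_ID`, `top_n`, and `min_score` are hypothetical and not part of this commit.

```python
import numpy as np

# Assumption: the single class in the label map has ID 1.
CAFO_CLASS_ID = 1


def cafo_detected(classes_arr, scores_arr, top_n=5, min_score=0.8):
    """Return True if any of the first top_n detections is a CAFO
    with a confidence score of at least min_score."""
    classes_arr = np.asarray(classes_arr)
    scores_arr = np.asarray(scores_arr)
    # Slicing is safe even when fewer than top_n detections are present.
    top_classes = classes_arr[:top_n]
    top_scores = scores_arr[:top_n]
    is_match = (top_classes == CAFO_CLASS_ID) & (top_scores >= min_score)
    return bool(np.any(is_match))
```

In the detection loop this could be called with `np.squeeze(classes).astype(np.int32)` and `np.squeeze(scores)`, the same arguments the notebook already passes to `cafo_in_image`.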