How can I use the gluon-cv model_zoo and output to an OpenCV window with Python?

Question

My code is:

import cv2
import gluoncv as gcv
import mxnet as mx

net = gcv.model_zoo.get_model('ssd_512_mobilenet1.0_voc', pretrained=True)

windowName = "ssdObject"
cv2.namedWindow(windowName, cv2.WINDOW_NORMAL)
cv2.resizeWindow(windowName, 1280, 720)
cv2.moveWindow(windowName, 0, 0)
cv2.setWindowTitle(windowName, "SSD Object Detection")
while True:
    # Check to see if the user closed the window
    if cv2.getWindowProperty(windowName, 0) < 0:
        # This will fail if the user closed the window; Nasties get printed to the console
        break
    ret_val, frame = video_capture.read()

    frame = mx.nd.array(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)).astype('uint8')
    rgb_nd, frame = gcv.data.transforms.presets.ssd.transform_test(frame, short=512, max_size=700)

    # # Run frame through network
    class_IDs, scores, bounding_boxes = net(rgb_nd)

    displayBuf = frame
    cv2.imshow(windowName, displayBuf)
    cv2.waitKey(0)

I somehow need to draw the bounding_boxes, class_IDs, and scores onto the image and output it via imshow.

How can I accomplish this?

Hmm. Not familiar with the lib, but do I guess correctly that `class_IDs, scores, bounding_boxes` are 3 arrays of the same length, tied together by index ids? i.e. each bounding box has an associated class_ID and score? | If so (unless there's some pre-made rendering function for this in gluoncv).... maybe just loop over the arrays and use the [primitive drawing functions](https://docs.opencv.org/3.1.0/dc/da5/tutorial_py_drawing_functions.html) to draw the rectangle and two texts, maybe with randomized colours... — Dan Mašek, Jan 23 '19 at 19:03
Hmm, maybe [this](https://gluon-cv.mxnet.io/api/utils.html#gluoncv.utils.viz.plot_bbox) is the pre-made one? — Dan Mašek, Jan 23 '19 at 19:09
This looks like it uses a matplotlib plot rather than the `OpenCV` window @DanMašek — Shamoon, Jan 23 '19 at 19:26
Right... so I guess either cook up your own simple renderer or have matplotlib output into an in-memory image and display that with `imshow`... although that seems a bit over the top. Drawing it yourself shouldn't be so bad... the worst thing I can see there is fiddling about with positioning/size of the text to make it look reasonable with various sizes of bounding boxes. — Dan Mašek, Jan 23 '19 at 19:33
Could you reduce this to running on a single input image (and provide it as PNG), and a sample `class_IDs, scores, bounding_boxes` values? Then I can cook up a solution without installing mxnet and finding/figuring out what video to use. | BTW, you probably should store the original `frame` in BGR format, if you want to use it for the visualization. (or, how does `gcv.data.transforms.presets.ssd.transform_test` modify the frame?) — Dan Mašek, Jan 23 '19 at 21:09

Kinght 金 · Accepted Answer · 2019-01-24T02:30:39.497

We can use an SSD or YOLO model (implemented in MXNet, Keras, or PyTorch) to detect the objects in the image. The result comes back as class IDs, scores, and bounding boxes. Iterate over the results, convert each one to plain Python values, and draw it with OpenCV.

(My English is poor, but I think the code below makes the idea clear.)

This is the source image:

This is the result displayed in OpenCV:

#!/usr/bin/python3
# 2019/01/24 09:05
# 2019/01/24 10:25

import gluoncv as gcv
import mxnet as mx
import cv2
import numpy as np
# https://github.com/pjreddie/darknet/blob/master/data/dog.jpg

## (1) Create network 
net = gcv.model_zoo.get_model('ssd_512_mobilenet1.0_voc', pretrained=True)

## (2) Read the image and preprocess 
img = cv2.imread("dog.jpg")
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

xrgb = mx.nd.array(rgb).astype('uint8')
rgb_nd, xrgb = gcv.data.transforms.presets.ssd.transform_test(xrgb, short=512, max_size=700)

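## (3) Inference
class_IDs, scores, bounding_boxes = net(rgb_nd)

## (4) Display
# transform_test resized the image, and the boxes are in the coordinates of
# that resized image (xrgb, in RGB order), so draw on a BGR copy of it.
canvas = cv2.cvtColor(xrgb, cv2.COLOR_RGB2BGR)
for i in range(len(scores[0])):
    cid = int(class_IDs[0][i].asscalar())
    score = float(scores[0][i].asscalar())
    # Detections are returned in descending score order (padding rows have
    # score -1), so stop at the first weak one.
    if score < 0.5:
        break
    cname = net.classes[cid]
    # Each box is (xmin, ymin, xmax, ymax), not (x, y, w, h).
    bbox = [int(v) for v in bounding_boxes[0][i].asnumpy()]
    xmin, ymin, xmax, ymax = bbox
    print(cid, score, bbox)
    tag = "{}; {:.4f}".format(cname, score)
    cv2.rectangle(canvas, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
    # Keep the label inside the image when the box touches the top edge.
    cv2.putText(canvas, tag, (xmin, max(ymin - 10, 15)), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 1)

cv2.imshow("ssd", canvas)
cv2.waitKey()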
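
One detail to watch in the code above: `transform_test` resizes the image (here so that its short side is 512 pixels), and the boxes returned by the network refer to that resized image, not to the file you loaded. If you would rather draw on the original, full-resolution image, the boxes have to be scaled back first. Here is a minimal sketch of that step, assuming each box is an `(xmin, ymin, xmax, ymax)` row. `rescale_boxes` is a made-up helper name for this example, not part of GluonCV:

```python
def rescale_boxes(boxes, resized_hw, original_hw):
    """Map (xmin, ymin, xmax, ymax) boxes from the resized image to the original.

    boxes       -- iterable of 4-number rows in resized-image pixel coordinates
    resized_hw  -- (height, width) of the image the network saw
    original_hw -- (height, width) of the image you want to draw on
    """
    sy = original_hw[0] / resized_hw[0]
    sx = original_hw[1] / resized_hw[1]
    max_y, max_x = original_hw[0] - 1, original_hw[1] - 1

    def clamp(value, upper):
        # Round to a whole pixel and keep it inside [0, upper].
        return min(max(int(round(value)), 0), upper)

    return [
        (clamp(xmin * sx, max_x), clamp(ymin * sy, max_y),
         clamp(xmax * sx, max_x), clamp(ymax * sy, max_y))
        for xmin, ymin, xmax, ymax in boxes
    ]
```

With the variables from the answer, this would be called as `rescale_boxes(bounding_boxes[0].asnumpy(), xrgb.shape[:2], img.shape[:2])`. Filter by score before drawing, since low-score and padding rows are rescaled like any other row.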

score 1 · Answer 2 · answered Sep 19 '19 at 03:57

GluonCV recently added visualization functions that draw with OpenCV.

To call them, just add a cv_ prefix to the function you are already using. For example, use cv_plot_bbox instead of plot_bbox.
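
As a rough sketch of what that could look like with the question's code, assuming a GluonCV version that ships `gluoncv.utils.viz.cv_plot_bbox`: the wrapper name `draw_detections` is made up for this example, and the imports sit inside the function only so the snippet can be loaded on a machine without GluonCV installed.

```python
def draw_detections(frame_rgb, bounding_boxes, scores, class_ids, class_names, thresh=0.5):
    """Draw detections with GluonCV's OpenCV helper; return a BGR image for cv2.imshow.

    frame_rgb is the resized RGB image returned by transform_test. The other
    arguments are the network outputs for one image (batch dimension removed).
    """
    import cv2
    import gluoncv as gcv

    # cv_plot_bbox takes the same leading arguments as plot_bbox, but draws
    # with OpenCV and returns the annotated image instead of a matplotlib axis.
    annotated = gcv.utils.viz.cv_plot_bbox(
        frame_rgb, bounding_boxes, scores, class_ids,
        thresh=thresh, class_names=class_names)
    # The frame is in RGB order, while cv2.imshow expects BGR.
    return cv2.cvtColor(annotated, cv2.COLOR_RGB2BGR)
```

Inside the question's loop this would be used as `displayBuf = draw_detections(frame, bounding_boxes[0], scores[0], class_IDs[0], net.classes)`, followed by `cv2.imshow(windowName, displayBuf)` and `cv2.waitKey(1)` so the video keeps playing.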

TomHall
