How to convert cv2.rectangle bounding box to YoloV4 annotation format (relative x,y,w,h)?

Question

I have trained a Yolo4 network and it is giving me bounding boxes as:

img_array = cv2.cvtColor(cv2.imread('image.png'), cv2.COLOR_BGR2RGB)
classes, scores, bboxes = model.detect(img_array, CONFIDENCE_THRESHOLD, NMS_THRESHOLD)

box = bboxes[0]

(x, y) = (box[0], box[1])
(w, h) = (box[2], box[3])

When I save the image by using cv2.rectangle as:

cv2.rectangle(img_array, (x, y), (x + w, y + h), (127,0,75), 1)

cv2.imwrite('image.png',img_array)

It draws a well-placed bounding box. I want to use this box and the shape of the image array to create a text file in the YOLOv4 format, where x, y, w, h are floating-point values between 0 and 1 relative to the image size.

Let us suppose I have my values as:

img_array.shape -> (443, 1265, 3)
box -> array([489, 126, 161, 216], dtype=int32)

So it gives me

(x, y) = (box[0], box[1]) -> (489, 126)
(w, h) = (box[2], box[3]) -> (161, 216)

For comparison, the bounding boxes I created with LabelImg are stored in the text file as:

0.453125 0.538462 0.132212 0.509615 # 0 is the class

How can I convert these coordinates to the YOLOv4 format? It is a bit confusing. I have tried several snippets from this answer, but none of them seem to work.

I also tried the code below, but I don't know whether it is right. Even if it is, I have no idea how to compute x_ and y_.

def yolov4_format(img_shape,box):
    x_img, y_img, c = img_shape
    (x, y) = (box[0], box[1])
    (w, h) = (box[2], box[3])
    
    x_, y_ = None # logic for these?
    w_ = w/x_img
    h_ = h/y_img
    return x_,y_, w_, h_

score 1 · Answer 1 · answered Apr 15 '21 at 17:19

It turns out I was close: x and y are not the top-left corner of the box but the center of the rectangle, as described by AlexeyAB in this answer. I looked through the LabelImg code, found the relevant function, and modified it for my use case.

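def bnd_box_to_yolo_line(box, img_size):
    # box is (x_min, y_min, width, height) in pixels;
    # img_size is img_array.shape, i.e. (height, width, channels)
    (x_min, y_min) = (box[0], box[1])
    (w, h) = (box[2], box[3])
    x_max = x_min + w
    y_max = y_min + h

    # box center, normalized by image width (img_size[1]) and height (img_size[0])
    x_center = float(x_min + x_max) / 2 / img_size[1]
    y_center = float(y_min + y_max) / 2 / img_size[0]

    # box width and height, normalized the same way
    w = float(x_max - x_min) / img_size[1]
    h = float(y_max - y_min) / img_size[0]

    return x_center, y_center, w, h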

All you need is the bounding box and the image shape (img_array.shape).
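As a worked example with the numbers from the question, here is a minimal, self-contained sketch of the same center-based conversion. The helper names box_to_yolo and yolo_line are made up for illustration, and the image shape is assumed to be in NumPy order (height, width, channels).

```python
def box_to_yolo(box, img_shape):
    """Convert a pixel box (x_min, y_min, w, h) to YOLO (x_center, y_center, w, h).

    img_shape is assumed to be NumPy-style: (height, width, channels).
    """
    x_min, y_min, w, h = (float(v) for v in box)
    img_h, img_w = img_shape[0], img_shape[1]
    return ((x_min + w / 2) / img_w,   # center x as a fraction of image width
            (y_min + h / 2) / img_h,   # center y as a fraction of image height
            w / img_w,                 # box width as a fraction of image width
            h / img_h)                 # box height as a fraction of image height


def yolo_line(class_id, box, img_shape):
    """Format one annotation line: '<class> <x_center> <y_center> <w> <h>'."""
    coords = box_to_yolo(box, img_shape)
    return " ".join([str(class_id)] + [f"{c:.6f}" for c in coords])


# Values from the question: box = [489, 126, 161, 216], shape = (443, 1265, 3)
line = yolo_line(0, (489, 126, 161, 216), (443, 1265, 3))
print(line)
```

The four values come out to roughly 0.4502, 0.5282, 0.1273 and 0.4876, which is in the same range as the hand-drawn LabelImg annotation shown in the question. Each such line would typically be written to a .txt file that shares the image's base name.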

score 0 · Answer 2 · answered May 01 '22 at 09:21

There is a more straightforward way to do this with pybboxes. Install it with:

pip install pybboxes

In your case,

import pybboxes as pbx

coco_bbox = (489, 126, 161, 216)  # (x_min, y_min, w, h), i.e. COCO-style
W, H = 1265, 443  # WxH of the image; img_array.shape is (H, W, channels)
pbx.convert_bbox(coco_bbox, from_type="coco", to_type="yolo", image_width=W, image_height=H)
# roughly (0.4502, 0.5282, 0.1273, 0.4876)

Note that converting to YOLO format requires the image width and height for scaling.
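Because every YOLO value is a fraction of the image size, one quick sanity check is to map the normalized box back to pixels and compare it with the box that was passed to cv2.rectangle. The sketch below does that in plain Python. yolo_to_pixel_box is a hypothetical helper (not part of pybboxes or OpenCV), and the shape is assumed to be in NumPy order (height, width, channels).

```python
def yolo_to_pixel_box(yolo_box, img_shape):
    """Map YOLO (x_center, y_center, w, h) back to pixel (x_min, y_min, w, h)."""
    x_c, y_c, w_rel, h_rel = yolo_box
    img_h, img_w = img_shape[0], img_shape[1]  # NumPy order: rows, then columns
    w = w_rel * img_w
    h = h_rel * img_h
    x_min = x_c * img_w - w / 2
    y_min = y_c * img_h - h / 2
    # Round to whole pixels, since cv2.rectangle expects integer coordinates
    return tuple(int(round(v)) for v in (x_min, y_min, w, h))


img_shape = (443, 1265, 3)  # img_array.shape from the question
# Normalized form of the question's box (489, 126, 161, 216)
yolo_box = (569.5 / 1265, 234 / 443, 161 / 1265, 216 / 443)
print(yolo_to_pixel_box(yolo_box, img_shape))  # (489, 126, 161, 216)
```

If the round trip does not reproduce the original box, or any normalized value is greater than 1, the width and height have most likely been swapped.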
