I am trying to generate a bounding box for object detection in an image. I read the image and generate a binary 2d numpy array such as:

array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 0, 0],
       [0, 0, 1, 1, 0, 0],
       [0, 0, 1, 1, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0]])

The 1s represent the pixels that should fall within the bounding box in the image. How can I get the x, y coordinates of the top-left point, and then the extent of the box along x and y (its width and height)?

daufoi

1 Answer

Check this simple code:

import numpy as np

a = np.array(
       [[0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 1, 1, 0, 0],
        [0, 0, 1, 1, 0, 0],
        [0, 0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0]])

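# np.where returns (row indices, column indices), so here x holds the
# vertical (row) positions and y the horizontal (column) positions.
x,y = np.where(a)
top_left = x.min(), y.min()        # (row, column) of the top-left pixel
bottom_right = x.max(), y.max()    # (row, column) of the bottom-right pixel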
Julien
  • I think it should be `y,x = np.where(...` – P i Oct 18 '21 at 07:35
  • What you call your coordinates is really a convention; you can call them `f` and `k` too if you prefer... – Julien Oct 18 '21 at 08:29
  • If you _are_ going to name your variables `x` and `y`, opposing the universal "x/y axis <->horizontal/vertical location" convention is perverse/counterproductive. Grateful for the answer, saved some time. But introduced a bug as `top_left` self-documents to `(horiz, vert)` coords while actually storing `(vert, horiz)`. `knick_knock_paddywhack` and `give_a_dog_a_bone` would be infinitely better choices. – P i Oct 19 '21 at 02:00
  • `numpy` arrays are indexed `[row, column]` (and are ROW-major by default). So the first index represents vertical location within the array, as it is written in your answer. – P i Oct 19 '21 at 02:04
  • So I guess you should tell the numpy developers that they chose the wrong convention as well by putting the 'y' before the 'x', oh and also the entire math community that the rows in a matrix should be stacked upwards rather than downwards... :) – Julien Oct 19 '21 at 03:37
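
To tie the naming discussion back to the original question, here is a minimal sketch that returns the box as image-style `(x, y, width, height)`, with `x` horizontal and `y` vertical. The function name `bounding_box` and the variable `mask` are hypothetical, introduced only for illustration. It also assumes the box should cover all nonzero pixels, and that an empty mask should yield `None`.

```python
import numpy as np

mask = np.array(
    [[0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 1, 1, 0, 0],
     [0, 0, 1, 1, 0, 0],
     [0, 0, 1, 1, 0, 0],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0]])


def bounding_box(mask):
    """Return (x, y, width, height) of the nonzero pixels, or None if there are none.

    Hypothetical helper: x is the column (horizontal) index and y is the
    row (vertical) index of the top-left pixel.
    """
    # np.nonzero returns one index array per axis: rows first, then columns.
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None  # assumption: an empty mask has no bounding box
    x, y = int(cols.min()), int(rows.min())
    # +1 because both the min and max pixels are inside the box.
    width = int(cols.max()) - x + 1
    height = int(rows.max()) - y + 1
    return x, y, width, height


print(bounding_box(mask))  # (2, 2, 2, 3)
```

Unpacking into `rows, cols` first and only then naming `x` and `y` keeps the array's `[row, column]` indexing separate from the image's `(x, y)` convention, which is the mix-up raised in the comments.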