
Background:

I'm playing around with Google's body segmentation API for Python. Since the library was originally written for JavaScript (tensorflow.js), the Python equivalent seems to be quite limited. The only way to get body segmentation appears to be comparing the RGB colors of the generated mask to the color each body part should have. For example:

[Image: body segmentation colors]

The torso should be green, so I know that:

torso = np.array([175, 240, 91]) 

The right part of the head should be purple so it's [110, 64, 170], and so on...

My approach:

# get prediction result
img = "front_pic"
img_filename = img + ".png"
image = tf.keras.preprocessing.image.load_img(img_filename)
image_array = tf.keras.preprocessing.image.img_to_array(image)
result = bodypix_model.predict_single(image_array)

# simple mask
mask = result.get_mask(threshold=0.75)

# colored mask (separate colour for each body part)
colored_mask = result.get_colored_part_mask(mask)
tf.keras.preprocessing.image.save_img(img+'_cmask'+'.jpg',
    colored_mask
)

# color codes
right_head = np.array([110, 64, 170])
left_head = np.array([143, 61, 178])
torso = np.array([175, 240, 91])
left_feet = np.array([84, 101, 214])
right_feet = np.array([99, 81, 195])
left_arm_shoulder = np.array([210, 62, 167])
right_arm_shoulder = np.array([255, 78, 125])

# (x,y) coordinates
coordinate_x = 0
coordinate_y = 0
for vertical_pixels in colored_mask:
    coordinate_y = coordinate_y + 1
    #if coordinate_y > height:
    #    coordinate_y = 0
    for pixels in vertical_pixels:
        coordinate_x = coordinate_x + 1
        if coordinate_x > width:
            coordinate_x = 1
        # Current Pixel
        np_pixels = np.array(pixels)
        current_coordinate = np.array([[coordinate_x,coordinate_y]])
        #print(current_coordinate)
        if np.array_equal(np_pixels,right_head) or np.array_equal(np_pixels,left_head): # right head or left head
            pixels_head = pixels_head + 1
            head_coordinates = np.concatenate((head_coordinates,current_coordinate),axis=0) # Save coordinates
        if np.array_equal(np_pixels,torso): # Torso
            torso_pixels = torso_pixels + 1
            torso_coordinates = np.concatenate((torso_coordinates,current_coordinate),axis=0) # Save coordinates
        if np.array_equal(np_pixels,left_feet) or np.array_equal(np_pixels,right_feet): # feet_pixels
            feet_pixels = feet_pixels + 1
            feet_coordinates = np.concatenate((feet_coordinates,current_coordinate),axis=0) # Save coordinates
        if np.array_equal(np_pixels,left_arm_shoulder): # left_arm_shoulder
            left_arm_shoulder_pixels = left_arm_shoulder_pixels + 1
            left_arm_shoulder_coordinates = np.concatenate((left_arm_shoulder_coordinates,current_coordinate),axis=0) # Save coordinates
        if np.array_equal(np_pixels,right_arm_shoulder): # right_arm_shoulder
            right_arm_shoulder_pixels = right_arm_shoulder_pixels + 1
            right_arm_shoulder_coordinates = np.concatenate((right_arm_shoulder_coordinates,current_coordinate),axis=0) # Save coordinates

The problem:

The problem with my approach is that it's super slow! For instance, this line of code:

if np.array_equal(np_pixels,torso): # Torso

takes a lot of execution time. Comparing each pixel to its RGB equivalent one at a time is too slow.

My question

What's the best solution? Either:

  1. There's a better way within the python-tf-bodypix library/API to get the coordinates of the segmented body parts' pixels. (Does anyone know if such a method exists within the bodypix library?)

or...

  1. Is there a better/faster approach to comparing two numpy arrays?

  2. Is there any other inefficient code in my approach that I should change?

Luis Cruz
  • Could you try creating a mask of the array where the values are equal to the target values, then use the mask to get the values/pixels you want? https://numpy.org/doc/stable/reference/generated/numpy.ma.masked_where.html – Eric B May 24 '22 at 00:18
  • The masking technique may not be the right approach to the solution, however, I believe there should be faster built-in methods to compare the numpy arrays as a whole instead of having a for loop for each dimension. – Eric B May 24 '22 at 00:39
  • Possible duplicate of: https://stackoverflow.com/questions/12138339/finding-the-x-y-indexes-of-specific-r-g-b-color-values-from-images-stored-in – Eric B May 24 '22 at 01:08

2 Answers


From the answer: Finding the (x,y) indexes of specific (R,G,B) color values from images stored in NumPy ndarrays

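The solution to your problem would be to compare the whole mask at once instead of pixel by pixel. Each resulting pair is (row, column), i.e. (y, x):

coords = list(zip(*np.where(np.all(colored_mask == torso, axis=-1))))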
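
To see the masking idea on its own, here is a minimal sketch on a tiny synthetic array. It assumes the mask has shape (height, width, 3); demo_mask and the other variable names are made up for illustration and are not part of the bodypix API.

```python
import numpy as np

# Colour codes copied from the question.
torso = np.array([175, 240, 91])
right_head = np.array([110, 64, 170])
left_head = np.array([143, 61, 178])

# Hypothetical 2x3 RGB array standing in for colored_mask.
demo_mask = np.zeros((2, 3, 3), dtype=np.uint8)
demo_mask[0, 1] = torso
demo_mask[1, 2] = torso
demo_mask[1, 0] = right_head

# True wherever all three channels match the torso colour.
is_torso = np.all(demo_mask == torso, axis=-1)

# One (row, col) pair per matching pixel, plus the pixel count.
torso_rows_cols = np.argwhere(is_torso)
torso_pixel_count = int(is_torso.sum())

# The head uses two colours, so OR the two boolean masks together.
is_head = (np.all(demo_mask == right_head, axis=-1)
           | np.all(demo_mask == left_head, axis=-1))
head_pixel_count = int(is_head.sum())
```

The same pattern replaces each `if np.array_equal(...)` branch of the loop: one boolean mask per body part, with counts and coordinates read off that mask.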
Eric B

You can use the fact that each of these RGB triplets sums to a different value.

right_head = np.array([110, 64, 170]) # SUM = 344
left_head  = np.array([143, 61, 178]) # SUM = 382
...

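So you can sum your pixel values along the RGB axis. For a (height, width, 3) mask like the one in the question, that is the last axis (use axis=0 instead if your array is channels-first):

x = np.sum(colored_mask, axis=-1)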

Then create a vector containing all the possible sums, each of which corresponds to a body part:

val = np.array([344,382,506,399,375,439,458]) # [right_head_sum, left_head_sum...]
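
As a sanity check, val can be derived from the colour codes rather than typed by hand, and the sums can be tested for uniqueness. This is only a sketch: the dict name part_colors is my own, and it covers only the seven colours listed in the question, so the check says nothing about other colours in the mask.

```python
import numpy as np

# Colour codes copied from the question; part_colors is an illustrative name.
part_colors = {
    "right_head": [110, 64, 170],
    "left_head": [143, 61, 178],
    "torso": [175, 240, 91],
    "left_feet": [84, 101, 214],
    "right_feet": [99, 81, 195],
    "left_arm_shoulder": [210, 62, 167],
    "right_arm_shoulder": [255, 78, 125],
}

# One channel sum per body part, in the dict's insertion order.
val = np.array([sum(rgb) for rgb in part_colors.values()])

# The trick only works if no two parts share the same channel sum.
sums_are_unique = len(np.unique(val)) == len(val)
```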

Then compare those values using broadcasting:

compare = x == val[:,None,None]

Count how many pixels you have in each of your 7 categories:

count = np.sum(compare,axis=(1,2))

And you can use np.where and np.split to retrieve the coordinates (note that coord_x indexes the first axis of x, i.e. rows, and coord_y the second, i.e. columns):

coord3d = np.where(compare)
split = np.where(np.diff(coord3d[0]))[0]+1
coord_x = np.split(coord3d[1],split)
coord_y = np.split(coord3d[2],split)
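
Putting the steps together, here is a minimal end-to-end sketch on a small synthetic mask. It assumes a (height, width, 3) array; demo_mask and colors are hypothetical stand-ins for the real colored_mask and colour table. One caveat with the np.split version: it relies on every category appearing at least once, otherwise the split lists no longer line up with val. The sketch below sidesteps that by calling np.nonzero per category, which is a swapped-in variation rather than the code above.

```python
import numpy as np

# Colour codes from the question, in the same order as val.
colors = np.array([
    [110, 64, 170],   # right_head
    [143, 61, 178],   # left_head
    [175, 240, 91],   # torso
    [84, 101, 214],   # left_feet
    [99, 81, 195],    # right_feet
    [210, 62, 167],   # left_arm_shoulder
    [255, 78, 125],   # right_arm_shoulder
], dtype=np.uint8)
val = colors.sum(axis=1, dtype=np.int64)

# Hypothetical 4x5 mask: black background plus a few coloured pixels.
demo_mask = np.zeros((4, 5, 3), dtype=np.uint8)
demo_mask[0, 2] = colors[0]      # one right_head pixel
demo_mask[1, 1:4] = colors[2]    # three torso pixels
demo_mask[3, 0] = colors[3]      # one left_feet pixel

# Channel sum per pixel, then one boolean plane per body part.
x = demo_mask.sum(axis=-1, dtype=np.int64)   # shape (4, 5)
compare = x == val[:, None, None]            # shape (7, 4, 5)
count = compare.sum(axis=(1, 2))

# (rows, cols) per body part; empty arrays for parts that are absent.
coords = [np.nonzero(plane) for plane in compare]
```

Here coords[2] holds the torso pixels as a (rows, cols) tuple, and parts that do not occur in the mask simply get empty arrays.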

Example with a 5x5 image, where x already holds the per-pixel channel sums:

x = array([[375, 399, 458, 382, 506],
          [375, 382, 506, 458, 382],
          [439, 344, 382, 344, 375],
          [439, 439, 344, 382, 344],
          [382, 399, 506, 399, 382]])

val = array([344, 382, 506, 399, 375, 439, 458])

count = array([4, 7, 3, 3, 3, 3, 2])    # [right_head, left_head,...

coord_x = [array([2, 2, 3, 3]),          # right_head
           array([0, 1, 1, 2, 3, 4, 4]), # left_head
           array([0, 1, 4]),             # ...
           array([0, 4, 4]),
           array([0, 1, 2]),
           array([2, 3, 3]),
           array([0, 1])]

coord_y = [array([1, 3, 2, 4]),          # right_head
           array([3, 1, 4, 2, 3, 0, 4]), # left_head
           array([4, 2, 2]),             # ...
           array([1, 1, 3]),
           array([0, 0, 4]),
           array([0, 0, 1]),
           array([2, 3])]

It should be way faster than your for loop.

obchardon