I am using Torch with some semantic segmentation algorithms to produce a binary mask of the segmented images. I would then like to crop the images based on that mask. To be clear, I need to crop on a per-pixel basis. It seems like a simple problem, but the only solutions I can come up with are to either invert a draw-mask function like the one in the COCO API, or to iterate over each pixel of the image and mask together, setting the pixel to black if it is not needed. I feel like there is a better way of doing this. Libraries in Lua, Python, Go, or C++ will work for me. Any ideas?

use findContours or extract all mask points (manually) and use the minBoundingRect function. Afterwards use subimage to get the cropped image. – Micka Nov 27 '16 at 09:44
6 Answers
I've implemented this in Python, assuming that your input image and mask are available as OpenCV images (Mat objects, which are NumPy arrays in Python). Given that src1 is your image and src1_mask is your binary mask:
import cv2
src1_mask = cv2.cvtColor(src1_mask, cv2.COLOR_GRAY2BGR)  # change mask to a 3-channel image
mask_out = cv2.subtract(src1_mask, src1)      # 255 - src1 inside the mask, 0 outside
mask_out = cv2.subtract(src1_mask, mask_out)  # src1 inside the mask, 0 outside
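Now mask_out contains the part of the image src1 located inside the binary mask you defined. This relies on the mask being 255 inside the segment and 0 outside, and on src1 being a 3-channel 8-bit image.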
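To see why the two subtractions recover the masked image, here is a minimal NumPy sketch that emulates the saturating arithmetic cv2.subtract uses for 8-bit images (negative results become 0). The helper saturating_subtract and the tiny arrays are made up for illustration, and the sketch assumes a mask that is 255 inside the segment and 0 outside.

```python
import numpy as np

def saturating_subtract(a, b):
    """Emulate 8-bit saturating subtraction: negative results clip to 0."""
    diff = a.astype(np.int16) - b.astype(np.int16)
    return np.clip(diff, 0, 255).astype(np.uint8)

# Hypothetical 2x2 image with 3 channels and a 0/255 mask (illustration only).
img = np.array([[[10, 20, 30], [40, 50, 60]],
                [[70, 80, 90], [100, 110, 120]]], dtype=np.uint8)
mask = np.array([[255, 0],
                 [0, 255]], dtype=np.uint8)
mask3 = np.repeat(mask[:, :, None], 3, axis=2)  # same role as COLOR_GRAY2BGR

step1 = saturating_subtract(mask3, img)     # 255 - img inside the mask, 0 outside
result = saturating_subtract(mask3, step1)  # img inside the mask, 0 outside
```

With a 0/1 mask the first subtraction would saturate to 0 almost everywhere, so the trick only works for a 0/255 mask.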

Here is a solution relying only on numpy:
import numpy as np

def get_segment_crop(img, tol=0, mask=None):
    if mask is None:
        mask = img > tol
    # keep only the rows and columns that contain at least one mask pixel
    return img[np.ix_(mask.any(1), mask.any(0))]
Now execute get_segment_crop(rgb, mask=segment_mask), where rgb is an ndarray of shape (h, w, c) and segment_mask is a boolean ndarray (i.e. containing True/False entries) of shape (h, w), with h the height and w the width (the first axis indexes rows).
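A small usage sketch, with a made-up 6x8 image and an L-shaped mask, shows what the function returns. The array contents are purely illustrative.

```python
import numpy as np

def get_segment_crop(img, tol=0, mask=None):
    if mask is None:
        mask = img > tol
    return img[np.ix_(mask.any(1), mask.any(0))]

# Hypothetical 6x8 RGB image; each pixel holds three consecutive numbers.
rgb = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)

# Mask covering rows 1..3 and columns 2..6, with one corner pixel removed.
segment_mask = np.zeros((6, 8), dtype=bool)
segment_mask[1:4, 2:7] = True
segment_mask[1, 2] = False

crop = get_segment_crop(rgb, mask=segment_mask)
print(crop.shape)  # (3, 5, 3)
```

Note that this trims the image to the rows and columns that contain mask pixels. A pixel inside that rectangle but outside the segment (the removed corner here) keeps its original value, so blacking it out is a separate step.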

For anyone else running into this: I had good luck converting the Torch binary mask tensor to type Double and then simply multiplying it, using Torch's cmul function, against each of the RGB channels. Because the binary mask has a 1 wherever a pixel belongs to the segment, that pixel's value is left unchanged, whereas a pixel outside the segmentation has a 0, which when multiplied across the channels produces black. Saransh's answer is also good, and works well for OpenCV.
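The same multiplication idea can be sketched in NumPy rather than Lua Torch (that substitution, the channel-first layout, and the array sizes are assumptions made for illustration; this is not the original Torch code).

```python
import numpy as np

# Hypothetical data: a 4x4 RGB image stored channel-first as (3, H, W)
# and a 0/1 mask of shape (H, W), both as floating point.
img = np.full((3, 4, 4), 200.0)
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0

# Element-wise multiply; broadcasting applies the same mask to every channel.
masked = img * mask
```

Pixels where the mask is 1 keep their value in all three channels, and pixels where it is 0 become black.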

Use OpenCV .copyTo with the mask option
http://docs.opencv.org/2.4/modules/core/doc/basic_structures.html#mat-copyto
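The effect of a masked copy can be sketched in NumPy as follows. This is an analogue for illustration, not the OpenCV call itself: pixels are copied from the source into the destination only where the mask is non-zero, so a destination that already exists keeps its other pixels. The array names, sizes, and the white background are made up.

```python
import numpy as np

src = np.full((4, 4, 3), 50, dtype=np.uint8)   # hypothetical source image
dst = np.full((4, 4, 3), 255, dtype=np.uint8)  # hypothetical white background
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 255

# Copy only the pixels selected by the mask; everything else in dst is untouched.
selected = mask > 0
dst[selected] = src[selected]
```

Starting from an all-zero destination instead gives the segment on a black background.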

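The mask contains white patches on a black background:
import cv2
src1 = cv2.imread('image.png', 0)  # flag 0 reads the image as grayscale
mask = cv2.imread('label.png', 0)
# any non-zero mask pixel becomes 255, everything else stays 0
ret, thresh1 = cv2.threshold(mask, 0, 255, cv2.THRESH_BINARY)
src1[thresh1 == 0] = 0  # black out every pixel outside the mask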

You can use the boundingRect function from OpenCV to retrieve the rectangle of interest, and you can crop the image to that rectangle. A Python implementation would look something like this:
import numpy as np
import cv2
mask = np.zeros([600,600], dtype=np.uint8)
mask[200:500,200:500] = 255 # set some values to 255 to represent an actual mask
rect = cv2.boundingRect(mask) # function that computes the rectangle of interest
print(rect)
img = np.ones([600,600, 3], dtype=np.uint8) # arbitrary image
cropped_img = img[rect[1]:(rect[1]+rect[3]), rect[0]:(rect[0]+rect[2])] # crop the image to the desired rectangle
Substitute mask and img with your own.
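Since the question asks for a per-pixel crop, here is a hedged NumPy-only sketch that computes the same kind of (x, y, w, h) rectangle from the mask's non-zero pixels, crops to it, and then blacks out whatever lies outside the mask. It reuses the example mask and image from above; computing the rectangle with np.nonzero is a stand-in for boundingRect, not the OpenCV implementation.

```python
import numpy as np

mask = np.zeros((600, 600), dtype=np.uint8)
mask[200:500, 200:500] = 255                  # example mask region
img = np.ones((600, 600, 3), dtype=np.uint8)  # arbitrary image

# Bounding rectangle of all non-zero mask pixels, as (x, y, width, height).
ys, xs = np.nonzero(mask)
x, y = int(xs.min()), int(ys.min())
w, h = int(xs.max()) - x + 1, int(ys.max()) - y + 1
print((x, y, w, h))  # (200, 200, 300, 300)

# Crop image and mask to the rectangle, then zero the pixels outside the mask.
cropped = img[y:y + h, x:x + w].copy()  # copy so the original image is not modified
cropped_mask = mask[y:y + h, x:x + w]
cropped[cropped_mask == 0] = 0
```

For a rectangular mask like this one the last step changes nothing, but for an irregular segment it removes the background left inside the bounding rectangle.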
