2

I was looking at how to change channels from BGR to RGB, and this came up. This works, but I'm baffled by this syntax. How does this type of data swapping work in numpy exactly?

Code from gist:

rgb = bgr[...,::-1]
feedMe
dev_nut

2 Answers

5

I am no expert on NumPy or on what its operations are formally called, but I can show you how to use various slicing (indexing) techniques to do some image processing.

In general, on RGB images, the index has one part per axis, separated by commas, and looks like this:

newImage = oldImage[ROWSTUFF, COLUMNSTUFF, CHANNELSTUFF]

where ROWSTUFF, COLUMNSTUFF and CHANNELSTUFF are each made up of:

start:end:step
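
As a quick check of the start:end:step idea, here is a minimal sketch on a small made-up array rather than a real image. The array `im` and its 4x6x3 shape are assumptions chosen only so the resulting shapes are easy to verify by hand.

```python
import numpy as np

# Hypothetical stand-in for an image: 4 rows, 6 columns, 3 channels
im = np.arange(4 * 6 * 3).reshape(4, 6, 3)

rows = im[1:3, :, :]   # ROWSTUFF = 1:3, keeps rows 1 and 2
cols = im[:, ::2, :]   # COLUMNSTUFF = ::2, keeps every second column
chan = im[:, :, 0]     # a plain integer instead of a slice drops that axis

print(rows.shape)  # (2, 6, 3)
print(cols.shape)  # (4, 3, 3)
print(chan.shape)  # (4, 6)
```

Any of start, end and step may be left out, in which case NumPy uses the beginning of the axis, the end of the axis, and a step of 1.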

So, let's do some processing on this image:


import numpy as np
from PIL import Image

# Load image with PIL/Pillow and make NumPy array - you can equally use OpenCV imread(), or other libraries
im = np.array(Image.open('start.png').convert('RGB'))

# im.shape is (400, 400, 3)

# Now extract top half by ending ROWSTUFF at 200
tophalf = im[:200,:,:]



# Now extract bottom half by starting ROWSTUFF at 200
bottomhalf = im[200:,:,:] 



# Now extract left half by ending COLUMNSTUFF at 200
lefthalf = im[:,:200,:]



# Now extract right half by starting COLUMNSTUFF at 200
righthalf = im[:,200:,:]



# Now scale the image by taking only every 4th row and every second column:
scaled = im[::4,::2,:]



# Now extract Red channel, by setting CHANNELSTUFF to 0
red = im[:,:,0]



# Now extract Green channel, by setting CHANNELSTUFF to 1
green = im[:,:,1] 



# Now flop the image top to bottom by striding backwards through ROWSTUFF
flop = im[::-1,:,:]



# Now flip the image left to right by striding backwards through COLUMNSTUFF
flip = im[:,::-1,:]  



# And finally, as in the question, reverse the channels by striding backwards through CHANNELSTUFF, which turns RGB into BGR and so leaves green and black unchanged
OP = im[:,:,::-1]  
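
To make the channel reversal concrete, here is a minimal sketch on a tiny made-up image instead of start.png. The two-pixel array `rgb` is an assumption for illustration only: one pure red pixel and one pure green pixel.

```python
import numpy as np

# Hypothetical 1x2 RGB image: one pure red pixel, one pure green pixel
rgb = np.array([[[255, 0, 0], [0, 255, 0]]], dtype=np.uint8)

# Stride backwards through the channel axis: R,G,B becomes B,G,R
bgr = rgb[:, :, ::-1]

print(bgr[0, 0].tolist())  # [0, 0, 255]  (red has moved to the last slot)
print(bgr[0, 1].tolist())  # [0, 255, 0]  (green sits in the middle, so it is unchanged)

# Basic slicing returns a view onto the same memory, not a copy
print(np.shares_memory(rgb, bgr))  # True
```

Because the result is a view, no pixel data is moved; NumPy just walks the channel axis in the opposite direction.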



And then just realise that ... is shorthand for "leaving unspecified dimensions as they are", so

[:,:,:,:, a:b:c] can be written as [..., a:b:c]

and

[a:b:c, :,:,:,:,:] can be written as [a:b:c, ...]
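
The two equivalences above can be checked directly. This is a small sketch on a made-up 5-dimensional array; the array `a`, its shape, and the particular slices `1:5:2` and `0:2:1` are assumptions standing in for a:b:c.

```python
import numpy as np

# Hypothetical 5-D array, just to have enough axes to make the point
a = np.arange(2 * 3 * 4 * 5 * 6).reshape(2, 3, 4, 5, 6)

last_explicit = a[:, :, :, :, 1:5:2]
last_ellipsis = a[..., 1:5:2]

first_explicit = a[0:2:1, :, :, :, :]
first_ellipsis = a[0:2:1, ...]

print(np.array_equal(last_explicit, last_ellipsis))    # True
print(np.array_equal(first_explicit, first_ellipsis))  # True
print(last_ellipsis.shape)                             # (2, 3, 4, 5, 2)
```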

Keywords: Image Processing, process, image, Python, Numpy, flip, flop, reverse, stride, start, end, range, slice, slicing, extract, scale, channel, reverse, BGR to RGB, RGB to BGR.

Mark Setchell
3

The ... (Ellipsis) stands for as many full slices (:) as are needed to cover the dimensions you have not listed, and the ::-1 means to reverse the elements of the array along the last dimension.

For example:

In [4]: rgb = np.arange(12).reshape(2,2,3)

In [5]: rgb
Out[5]: 
array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]]])
In [8]: rgb[...,::-1]
Out[8]: 
array([[[ 2,  1,  0],
        [ 5,  4,  3]],

       [[ 8,  7,  6],
        [11, 10,  9]]])
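
As a rough check on the same 2x2x3 array, the sketch below compares the ellipsis form with the fully spelled-out form, and with a two-part index that reverses a different axis. The variable names `channels_reversed` and `columns_reversed` are mine, not from the question.

```python
import numpy as np

rgb = np.arange(12).reshape(2, 2, 3)

# On a 3-D array, ... expands to two full slices
channels_reversed = rgb[..., ::-1]
print(np.array_equal(channels_reversed, rgb[:, :, ::-1]))  # True

# With only two index parts, ::-1 lands on the second axis instead
columns_reversed = rgb[:, ::-1]

print(channels_reversed[0].tolist())  # [[2, 1, 0], [5, 4, 3]]
print(columns_reversed[0].tolist())   # [[3, 4, 5], [0, 1, 2]]
```

So on a 3-D image array, `[..., ::-1]` and `[:, ::-1]` are not interchangeable: the first swaps channels, the second mirrors columns.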
user545424
  • Do you know of any docs that explain this in general? I thought commas were meant to separate each axis (or dimension). This seems so cryptic. – dev_nut Mar 01 '19 at 22:11
  • Yes, the comma separates the dimensions. The ... stands in for all the leading ones, which is why only the last dimension is reversed. You can find more about the ellipsis here: https://stackoverflow.com/questions/772124/what-does-the-python-ellipsis-object-do, and slicing syntax is documented extensively here: https://docs.python.org/2.3/whatsnew/section-slices.html. – user545424 Mar 01 '19 at 22:28
  • 1
    For a two-dimensional array, the example could equally well have been written `rgb = bgr[:,::-1]`, i.e. take all elements along the first dimension, and all elements along the second dimension with stride = -1 (i.e. reverse it). For a three-dimensional image array the explicit form is `rgb = bgr[:,:,::-1]`. – user545424 Mar 01 '19 at 22:30