Most image representations work with a bitmap in an RGB color space: the image is seen as a rectangle of pixels, and every pixel is assigned a specific color. A color is represented as a 3-tuple, where the first item is the intensity of red, the second the intensity of green, and the last the intensity of blue. Note that this is only one way to represent an image; there are others, such as vector graphics, and there are other color spaces as well.
This means that if we load such an image into memory, we obtain a matrix with shape `(h, w, 3)`, with `h` the height of the image (in pixels) and `w` the width of the image (again in pixels).
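As a quick sanity check, we can build a small synthetic image with NumPy (the dimensions here are arbitrary, chosen just for illustration) and inspect its shape:

```python
import numpy as np

# A hypothetical 4x6 image: height 4, width 6, 3 color channels (RGB).
h, w = 4, 6
image = np.zeros((h, w, 3))

print(image.shape)  # (4, 6, 3)
```

Loading a real file with a library such as Pillow or imageio yields an array of the same layout.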
Now NumPy supports slicing along any axis: we can construct a view with `image[:,:,0]`. This gives an `(h, w)`-shaped matrix where the item at index `[i, j]` holds the value at `[i, j, 0]` in the original image. We thus obtain an image that only takes the intensity of the red channel into account.
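A minimal sketch of extracting the red channel (the 2x2 image and its pixel values are made up for the example). Note that the slice is a *view*, not a copy, so writing to it changes the original array:

```python
import numpy as np

# Tiny synthetic image; only the top-left pixel is non-zero.
image = np.zeros((2, 2, 3))
image[0, 0] = (1.0, 0.5, 0.0)

red = image[:, :, 0]   # view on the red channel, shape (2, 2)
print(red.shape)       # (2, 2)
print(red[0, 0])       # 1.0

# Because it is a view, modifying it writes through to the original image.
red[0, 0] = 0.25
print(image[0, 0, 0])  # 0.25
```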
The same holds for `image[:,:,1]` and `image[:,:,2]`, where we take the green and blue channel into account respectively. The representation uses floats, where `1.0` means maximum intensity and `0.0` means lowest intensity. For instance, `(red, green, blue) = (1.0, 0.5, 0.0)` is a color that most people see as orange.
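Many file formats store channels as 8-bit integers instead of floats; a common convention (assumed here, not the only one) maps the float range `0.0`..`1.0` onto `0`..`255`:

```python
import numpy as np

# The example color from above, as a float triple.
color = np.array([1.0, 0.5, 0.0])

# Scale to 8-bit: 0.0 -> 0, 1.0 -> 255.
as_uint8 = (color * 255).round().astype(np.uint8)
print(as_uint8)  # [255 128   0]
```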