Numpy image slicing returning black patches/ wrong values

Question

The end goal is to take an image and slice it up into samples that I save. The problem is that my slices are randomly returning black or incorrect patches. Below is a small sample program.

import scipy.ndimage as ndimage
import scipy.misc as misc
import numpy as np

image32 = misc.imread("work0.png")
patches = np.zeros((36, 8, 8))
for i in range(4):
  for j in range(4):
    patches[i*4 + j] = image32[i:i+8,j:j+8]
    misc.imsave("{0}{1}.png".format(i,j), patches[i*4 + j])

An example of my image would be:

The 8 x 8 patch at (0, 0) yields:

score 7 · Accepted Answer · edited May 23 '17 at 11:43

Two things:

1. You are initializing your patches matrix with the wrong data type. By default, numpy makes patches a np.float64 array, and saving that will not give the results you expect. Specifically, as Mr. F's answer explains, floating-point images are rescaled on save so that the minimum value maps to black and the maximum to white. If a patch is completely uniform, its minimum and maximum are the same and the whole patch is written out as black. The best fix is to respect the original image's data type by setting the dtype of patches to np.uint8.
2. Judging from your for loop indexing, you want to extract 8 x 8 patches that do not overlap. For a 32 x 32 image that means 16 patches in total, arranged in a 4 x 4 grid.

Therefore, change the patches allocation so that its first dimension is 16, not 36. You also need to adjust how you index into the image, because right now the patches overlap. Specifically, the rows should go from 8*i to 8*(i+1) and the columns from 8*j to 8*(j+1). If you substitute sample values of i and j yourself, you'll see that each grid cell yields a unique 8 x 8 patch.
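
Both points can be checked quickly on a synthetic array. This is only a sketch: it assumes a 32 x 32 single-channel uint8 image, and the image32 below is generated in place of reading work0.png.

```python
import numpy as np

# Synthetic stand-in for the 32 x 32 grayscale image (an assumption; the
# original is read from work0.png).
image32 = (np.arange(32 * 32) % 256).astype(np.uint8).reshape(32, 32)

# Point 1: np.zeros defaults to float64 unless a dtype is given.
default_dtype = np.zeros((16, 8, 8)).dtype
matching_dtype = np.zeros((16, 8, 8), dtype=image32.dtype).dtype
print(default_dtype, matching_dtype)

# Point 2: the original slices start at rows/columns 0..3, so neighbouring
# patches overlap and never leave the top-left 11 x 11 corner of the image.
overlapping_00 = image32[0:8, 0:8]
overlapping_01 = image32[0:8, 1:9]
shared = np.array_equal(overlapping_00[:, 1:], overlapping_01[:, :7])

# Stepping by 8 tiles the image instead: count how often each pixel is used.
covered = np.zeros_like(image32, dtype=int)
for i in range(4):
    for j in range(4):
        covered[8*i:8*(i+1), 8*j:8*(j+1)] += 1
print(shared, covered.min(), covered.max())
```

Every pixel being counted exactly once is what "non-overlapping 4 x 4 grid" means here.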

With both of the above things I noted, the modified code should be:

import scipy.ndimage as ndimage
import scipy.misc as misc
import numpy as np

image32 = misc.imread('work0.png')

patches = np.zeros((16,8,8), dtype=np.uint8) # Change

for i in range(4):
    for j in range(4):
        patches[i*4 + j] = image32[8*i:8*(i+1),8*j:8*(j+1)] # Change
        misc.imsave("{0}{1}.png".format(i,j), patches[i*4 + j])

When I do this and take a look at the output images, I get what I expect.
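
If you only need the patch array and not the per-patch files, the same 4 x 4 tiling can probably be done without a loop. The sketch below is an alternative to the loop above, not a replacement for it; it again uses a synthetic image32 (an assumption) and compares the result against the loop-based slicing.

```python
import numpy as np

# Synthetic stand-in for the 32 x 32 grayscale image (an assumption).
image32 = (np.arange(32 * 32) % 256).astype(np.uint8).reshape(32, 32)

# Split rows into (4 blocks, 8 rows) and columns into (4 blocks, 8 columns),
# move the two block axes to the front, then flatten them into 16 patches.
patches = image32.reshape(4, 8, 4, 8).swapaxes(1, 2).reshape(16, 8, 8)

# Reference: the loop-based slicing, with patch i*4 + j at grid cell (i, j).
loop_patches = np.zeros((16, 8, 8), dtype=np.uint8)
for i in range(4):
    for j in range(4):
        loop_patches[i*4 + j] = image32[8*i:8*(i+1), 8*j:8*(j+1)]

print(patches.dtype, np.array_equal(patches, loop_patches))
```

Because reshape keeps the source dtype, patches stays uint8 without an explicit dtype argument.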

To be absolutely sure, let's plot the segments using matplotlib. You've conveniently stored all of the patches in patches, so showing them is straightforward. I've also left a commented-out line that reads each saved image back from disk instead, so you can verify the result either from patches or from the files:

import matplotlib.pyplot as plt

plt.figure()
for i in range(4):
    for j in range(4):
        plt.subplot(4, 4, 4*i + j + 1)
        img = patches[4*i + j]
        # or you can do this:
        # img = misc.imread('{0}{1}.png'.format(i,j))
        img = np.dstack([img, img, img])
        plt.imshow(img)

plt.show()

The weird thing about matplotlib.pyplot.imshow is that a single-channel image (such as yours) with the same intensity everywhere gets visualized as black, much like what we experienced with imsave. By default the colour limits are taken from the data's own minimum and maximum, so a uniform patch has no range to map and lands at the bottom of the colour map. Therefore, I artificially made each patch an RGB image with three identical channels, so that it is shown as grayscale with its true intensity.
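
A possible alternative to stacking the channels is to pass fixed colour limits to imshow through its vmin and vmax arguments. The sketch below is not what the code above does; it assumes a uniform uint8 patch and uses the non-interactive Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, chosen only so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# A uniform 8 x 8 patch, standing in for one of the constant-intensity blocks.
patch = np.full((8, 8), 255, dtype=np.uint8)

fig, ax = plt.subplots()
# Fixing the limits to the uint8 range stops imshow from deriving them from
# the patch's own (identical) minimum and maximum.
im = ax.imshow(patch, cmap="gray", vmin=0, vmax=255)

limits = im.get_clim()             # colour limits actually in use
white_level = float(im.norm(255))  # where the value 255 lands in [0, 1]
print(limits, white_level)
plt.close(fig)
```

In the plotting loop this would stand in for the np.dstack line, for example plt.imshow(img, cmap='gray', vmin=0, vmax=255).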

We get:

Thx, the indexing was simplified to demonstrate the issue. I'm running this through theano which will eventually be casted to float32,... I don't think this will cause any issues. Anyway thank you for the help! — Dr.Knowitall, Aug 16 '15 at 01:19
@Dr.Knowitall - I've also added in code to show each patch just to verify that what we have is solid. — rayryeng, Aug 16 '15 at 01:26

score 4 · Answer 2 · edited May 23 '17 at 11:58

According to this answer the issue is that imsave normalizes the data so that the computed minimum is defined as black (and, if there is a distinct maximum, that is defined as white).

This led me to go digging as to why the suggested use of uint8 did work to create the desired output. As it turns out, in the source there is a function called bytescale that gets called internally.

Actually, imsave itself is a very thin wrapper around toimage followed by save (from the image object). Inside of toimage if mode is None (which it is by default), that's when bytescale gets invoked.

It turns out that bytescale has an if statement that checks for the uint8 data type, and if the data is in that format, it returns the data unaltered. But if not, then the data is scaled according to a max and min transformation (where 0 and 255 are the default low and high pixel values to compare to).

This is the full snippet of code linked above:

if data.dtype == uint8:
    return data

if high < low:
    raise ValueError("`high` should be larger than `low`.")

if cmin is None:
    cmin = data.min()
if cmax is None:
    cmax = data.max()

cscale = cmax - cmin
if cscale < 0:
    raise ValueError("`cmax` should be larger than `cmin`.")
elif cscale == 0:
    cscale = 1

scale = float(high - low) / cscale
bytedata = (data * 1.0 - cmin) * scale + 0.4999
bytedata[bytedata > high] = high
bytedata[bytedata < 0] = 0
return cast[uint8](bytedata) + cast[uint8](low)

For the blocks of your data that are all 255, cscale will be 0, which will be checked for and changed to 1. Then the line

bytedata = (data * 1.0 - cmin) * scale + 0.4999

gives every pixel in the block the float value 0.4999, which is then truncated to 0 when it is cast from float to uint8 in the final line, for example:

In [102]: np.cast[np.uint8](0.4999)
Out[102]: array(0, dtype=uint8)
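
To see the whole path at once, the quoted steps can be replayed with plain NumPy. The function below, bytescale_sketch, is a hypothetical stand-in written from the snippet above, not the library function itself, and it uses astype in place of the cast[uint8] lookup.

```python
import numpy as np

def bytescale_sketch(data, cmin=None, cmax=None, high=255, low=0):
    """Hypothetical re-implementation of the scaling steps quoted above."""
    data = np.asarray(data)
    if data.dtype == np.uint8:
        return data  # uint8 input is returned untouched
    if high < low:
        raise ValueError("`high` should be larger than `low`.")
    cmin = data.min() if cmin is None else cmin
    cmax = data.max() if cmax is None else cmax
    cscale = cmax - cmin
    if cscale < 0:
        raise ValueError("`cmax` should be larger than `cmin`.")
    elif cscale == 0:
        cscale = 1  # uniform data: avoid dividing by zero
    scale = float(high - low) / cscale
    bytedata = (data * 1.0 - cmin) * scale + 0.4999
    bytedata[bytedata > high] = high
    bytedata[bytedata < 0] = 0
    return bytedata.astype(np.uint8) + np.uint8(low)

white_float = np.full((8, 8), 255.0)                # uniform float64 block
white_uint8 = np.full((8, 8), 255, dtype=np.uint8)  # same block as uint8
ramp = np.array([[0.0, 0.5, 1.0]])                  # non-uniform float data

print(bytescale_sketch(white_float).max())  # 0: the white block turns black
print(bytescale_sketch(white_uint8).min())  # 255: uint8 passes through as-is
print(bytescale_sketch(ramp))               # [[  0 127 255]]
```

The ramp shows that float data with a real range is stretched to 0..255, while a uniform float block collapses to 0.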

You can see in the body of bytescale that there are only two possible ways to return: either your data is type uint8 and it's returned as-is, or else it goes through this kind of silly scaling process. So in the end, it is indeed correct, and good practice, to be using uint8 for the pieces of your code that specifically load from or save to an image format via these functions.

So this cascade of stuff is why you were getting all zeros in the outputted image file and why the other suggestion of using dtype=np.uint8 actually helps you. It's not because you need to avoid floating point data for images, just because of this bizarre convention to check and scale data on the part of imsave.

I just wanted to point out that I didn't say that the OP needed to avoid floating point data... I just said that he/she should probably respect the original input image's type. That being said, great detective work. +1. — rayryeng, Aug 16 '15 at 01:34
Totally, I didn't mean to imply. I added a second ago that one can see from the body of `bytescale` that indeed the *only* way to return from that function without involving the weird scaling is to use `uint8`, so it is probably the best practice on all fronts. — ely, Aug 16 '15 at 01:37
This was some great insight that you provided. I didn't know that `imsave` actually did some scaling on floating point data. I always saved the images with `dtype=np.uint8` to be safe using `imsave`... and it turns out that doing it this way by fluke always gave me the right results lol. Thank you very much. — rayryeng, Aug 16 '15 at 01:38
I wish I could accept your answer as well, this is amazing information! — Dr.Knowitall, Aug 16 '15 at 21:24
