
So for some ML work I am doing, I need to generate random images and add them to existing images. For testing, I generate these random images, multiply them by zero, and then add them to my existing images. I would expect to get back an image identical to the original, yet I receive a more blue-ish version of it:

[Image: Original vs. Generated]

I have been racking my brain over this for hours now and cannot work out what is causing the discrepancy. Here is the relevant code:

import numpy as np
from PIL import Image

# unit_img is an ndarray with random entries, then normalized
# so that the sum of the squares of all its elements is 1

# min_dist is the scalar we multiply our unit image by; since
# it's zero, we don't care about the contents of the unit image

min_dist = 0
...
unit_img = np.load(path_to_unit_img)
unit_img = min_dist * unit_img

# check if our unit img and our original image are the same size
if unit_img.size != checked_img.size:
    continue
# "move" our new image to the solution space of the original img
addition = unit_img + checked_img

result_img = Image.fromarray(addition.astype('uint8')).convert('RGB')

# now we save our generated image
result_img.save(save_path + extension + img[:-4] + "_" + str(x) + ".jpg")

For full disclosure, I am iterating over a couple thousand images, and unit_img is different for each image. Running a simple test program that loads both the cat images shown above and prints them out, I see:

# original image

[[[164 159 160]
  [164 159 160]
  [164 159 160]
  ...
  [152 152 152]
  [152 152 152]
  [152 152 152]]

 [[165 160 161]
  [165 160 161]
  [165 160 161]
  ...
  [152 152 152]
  [152 152 152]
  [152 152 152]]

 [[162 160 160]
  [162 160 160]
  [162 160 160]
  ...
  [152 152 152]
  [152 152 152]
  [152 152 152]]

 ...

 [[151 143 136]
  [151 143 136]
  [151 143 136]
  ...
  [ 81  81  81]
  [ 83  83  83]
  [ 85  85  85]]

 [[152 144 137]
  [152 144 137]
  [152 144 137]
  ...
  [ 86  86  86]
  [ 83  83  83]
  [ 82  82  82]]

 [[152 144 137]
  [152 144 137]
  [152 144 137]
  ...
  [ 89  89  89]
  [ 82  82  82]
  [ 78  78  78]]]
===========================================
# Resultant from adding an array of zeros

[[[160 159 163]
  [160 159 163]
  [160 159 163]
  ...
  [152 152 152]
  [152 152 152]
  [152 152 152]]

 [[161 160 164]
  [161 160 164]
  [161 160 164]
  ...
  [152 152 152]
  [152 152 152]
  [152 152 152]]

 [[160 159 161]
  [160 159 161]
  [160 159 161]
  ...
  [152 152 152]
  [152 152 152]
  [152 152 152]]

 ...

 [[137 143 150]
  [137 143 150]
  [137 143 150]
  ...
  [ 80  80  80]
  [ 83  83  83]
  [ 87  87  87]]

 [[138 144 151]
  [138 144 151]
  [138 144 151]
  ...
  [ 86  86  86]
  [ 83  83  83]
  [ 82  82  82]]

 [[138 144 151]
  [138 144 151]
  [138 144 151]
  ...
  [ 89  89  89]
  [ 82  82  82]
  [ 77  77  77]]]

So obviously, the images are numerically similar, but somewhere along the way the order of the values along the color axis got reversed. What I have tried so far is to check whether my unit_img, after

unit_img = min_dist * unit_img

is really all zeros by calling np.count_nonzero, and it always reports 0 non-zero elements, meaning I really am adding an ndarray full of zeros. That suggests I am saving the image incorrectly somehow, or maybe with the wrong datatype. Any help at all would be appreciated!
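
For reference, the zero check is roughly the following (just a standalone sanity check, reusing the variables from the code above):

# unit_img here is the array after the min_dist scaling above
print(np.count_nonzero(unit_img))  # always prints 0, so it really is all zeros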

Sid Devic
  • Have you checked that both arrays you are adding together contain data of the same type? Another possibility is that the array of zeros has different dimensions (likely the 3rd/color one) than your original image. With the uint8 cast and then the color-space conversion, something may get lost in translation. Better to do those steps separately and check each. – Jello Mar 19 '18 at 17:36
  • What is `type(checked_img)`? (Show the exact result of that function call.) If it is a numpy array, what is `checked_img.dtype`? – Warren Weckesser Mar 19 '18 at 17:53
  • You are saving the result as a JPEG file. That is a lossy format, so just saving it to a new file can change the pixel values. Try using a lossless format, such as PNG (see the sketch after these comments). – Warren Weckesser Mar 19 '18 at 17:54
  • I will look into switching to PNG, thanks! But specifically for the channel inversion, it turns out that cv2.imread was reading the image into a BGR instead of an RGB ndarray. Thanks to all! – Sid Devic Mar 19 '18 at 19:45
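
To act on the lossless-format suggestion, only the save line needs to change; PIL picks the output format from the file extension, so something like this (keeping the question's path-building expression as-is) should be enough:

# PNG is lossless, so the saved pixels match the uint8 array exactly;
# re-saving as JPEG perturbs the values slightly on every write
result_img.save(save_path + extension + img[:-4] + "_" + str(x) + ".png")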

2 Answers


How are you opening the image? Could it be that it doesn't start out in RGB format, and your conversion is messing with it? If you are using OpenCV, the image may start out in BGR format.

Also, you say `if unit_img.size != checked_img.size:` is there to check whether the images are equal in size, but it actually checks whether they are not equal. And either way, the rest of the code shown will run because it is unindented, so it is not part of that check's logic.
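
As a quick diagnostic (using the variable names from the question), it may be worth printing the shapes and dtypes of both arrays before adding them; `.size` only counts total elements and ignores the channel layout:

# compare layout and element type explicitly, not just total element count
print(unit_img.shape, unit_img.dtype)
print(checked_img.shape, checked_img.dtype)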

Novice

I just had to add:

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

right after every place I loaded an image with cv2.imread(). This is because imread returns the pixel data in BGR order instead of RGB, so the color channels ended up reversed. Thanks for all the help!
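
A minimal sketch of the corrected loading step, assuming the rest of the pipeline from the question stays the same (path_to_checked_img is just a placeholder for however the image path is built):

import cv2
import numpy as np
from PIL import Image

# OpenCV returns the pixel data in BGR channel order
checked_img = cv2.imread(path_to_checked_img)

# reorder the channels to RGB so Image.fromarray interprets them correctly
checked_img = cv2.cvtColor(checked_img, cv2.COLOR_BGR2RGB)

unit_img = min_dist * np.load(path_to_unit_img)
addition = unit_img + checked_img

result_img = Image.fromarray(addition.astype('uint8')).convert('RGB')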

Sid Devic