
I have an image of shape 200x250x3. I want to zero-pad it on the top, left, right, and bottom to reach a target shape of 256x256x3. How can I do this in TensorFlow? I found the function tf.pad, but it requires the padding sizes to be given explicitly, whereas my task needs them to be computed automatically.

https://www.tensorflow.org/versions/r0.8/api_docs/python/array_ops.html#pad

tf.pad(tensor, paddings, mode='CONSTANT', name=None)
Jame

1 Answer


To pad images to a target shape, you can use tf.image.resize_image_with_crop_or_pad(). This op crops the image if it is larger than the target size, and pads it with zeros (evenly on all sides) if it is smaller.

>>> import tensorflow as tf
>>> a = tf.ones([3, 4, 3])
>>> tf.image.resize_image_with_crop_or_pad(a, 5, 5)
<tf.Tensor 'Squeeze:0' shape=(5, 5, 3) dtype=float32>

If you want to use tf.pad directly, you can define a function that computes the padding amounts from the difference between the desired size and the shape of the tensor (tf.shape()), and then pads by that difference. Check this answer for padding.
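The padding arithmetic described above can be sketched in plain Python as follows. This is only an illustration, not part of TensorFlow: the helper name `center_paddings` is hypothetical, and it assumes a static height x width x channels shape with any odd leftover row or column placed at the bottom or right.

```python
def center_paddings(height, width, target_height, target_width):
    """Return [[top, bottom], [left, right], [0, 0]] for a H x W x C image.

    Hypothetical helper: the layout matches the `paddings` argument
    that tf.pad expects for a rank-3 tensor.
    """
    pad_h = target_height - height
    pad_w = target_width - width
    if pad_h < 0 or pad_w < 0:
        raise ValueError("target size must not be smaller than the image")
    top = pad_h // 2    # any odd leftover row goes to the bottom
    left = pad_w // 2   # any odd leftover column goes to the right
    return [[top, pad_h - top], [left, pad_w - left], [0, 0]]


# The 200x250x3 image from the question, padded to 256x256x3.
paddings = center_paddings(200, 250, 256, 256)
print(paddings)  # [[28, 28], [3, 3], [0, 0]]
```

The resulting list can then be passed as tf.pad(image, paddings, mode='CONSTANT'). If the shape is only known when the graph runs, the same arithmetic would have to be done on the values from tf.shape(image) instead of Python integers.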

umutto
  • Great, that is what I was looking for. Could you tell me how I can convert the result from a tensor to an image? My expected output is an image. – Jame Feb 08 '18 at 01:24
  • @Jame I'm not sure what you mean by converting a tensor to an image. If you mean saving or displaying the result when the graph is run, it would probably give you a numpy array in the shape of an image, which you can save to disk or plot using matplotlib. If you want tensorflow to encode the tensor, I've never used it, but [tf.image.encode_png](https://www.tensorflow.org/api_docs/python/tf/image/encode_png) seems to do that. – umutto Feb 08 '18 at 01:31
  • Thanks. I mean my input is an RGB image. After using the tf function, it returned a tensor. I expect the output to be the same type as the input, that is, an RGB image. – Jame Feb 08 '18 at 01:35
  • @Jame Oh, tensorflow only works with tensors. Once the graph is run, the results can be received as numpy arrays, which will have the same shape (height, width, channels). You need to convert that numpy array to an RGB image with the library you use. Say, if you are using opencv, it uses numpy as its underlying data structure, and you can use `cv2.imwrite()` or `cv2.imshow()` to convert it to a human-readable format. If you are using pillow (PIL), you can use `img = Image.fromarray()`. – umutto Feb 08 '18 at 01:42
  • Thanks. I think I solved it by using a tensorflow session: `init = tf.initialize_all_variables(); sess = tf.Session(); sess.run(init)`. Thanks for your help. I have a minor question: after getting the padded result, can we recover the original image from the padded image (the opposite problem)? – Jame Feb 08 '18 at 01:50
  • @Jame, I'm not aware of any existing reverse op, but you can write a simple function to remove the extra padding. That op pads images centrally with zeros, and when the amount to pad is odd, the extra row or column goes to the bottom or right ([you can check the source code for its implementation](https://github.com/tensorflow/tensorflow/blob/r1.5/tensorflow/python/ops/image_ops_impl.py#L573)). So `resize_image_with_crop_or_pad([[1, 2, 3], [4, 5, 6]], 4, 4)` will give you `[[0, 0, 0, 0], [1, 2, 3, 0], [4, 5, 6, 0], [0, 0, 0, 0]]`. In your function you can manually remove the two zero rows at the top and bottom, then remove the zero column on the right. – umutto Feb 08 '18 at 02:09
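
The manual removal described in the last comment can be sketched in plain Python. This is a minimal illustration under stated assumptions: the helper name `remove_center_padding` is hypothetical, the original height and width must be known, and the padding is assumed to have been placed centrally with any odd leftover at the bottom or right, as in the comment's example.

```python
def remove_center_padding(padded, height, width):
    """Slice the original height x width block out of a centrally padded grid.

    Hypothetical helper operating on nested lists (rows of values).
    """
    top = (len(padded) - height) // 2      # rows of padding above the image
    left = (len(padded[0]) - width) // 2   # columns of padding left of the image
    return [row[left:left + width] for row in padded[top:top + height]]


# The padded 4x4 example from the comment above, restored to 2x3.
padded = [[0, 0, 0, 0],
          [1, 2, 3, 0],
          [4, 5, 6, 0],
          [0, 0, 0, 0]]
restored = remove_center_padding(padded, 2, 3)
print(restored)  # [[1, 2, 3], [4, 5, 6]]
```

With a numpy array returned from a session run, the same offsets could be applied as a slice of the form padded[top:top + height, left:left + width].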