
I want to overlay a smaller image onto a larger one.

I have tried adding to a slice but couldn't get it to work.

So, as a simple example, how do I perform this NumPy operation in Tensorflow:

a = np.array([1, 1, 1, 1])
b = np.array([5, 5])
c = a.copy()  # copy so the original array `a` is not modified in place
c[1:3] = c[1:3] + b
print(c)
# => [1 6 6 1]
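
For reference, here is a minimal 1-D sketch of this operation in TensorFlow, assuming the padding-and-addition approach used in the answer below and the TF 1.x graph API:

import tensorflow as tf

a = tf.constant([1., 1., 1., 1.])
b = tf.constant([5., 5.])
# Pad b with one zero on each side so it lines up with a[1:3], then add
c = a + tf.pad(b, [[1, 1]])
with tf.Session() as sess:
    print(sess.run(c))  # => [1. 6. 6. 1.]
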
Pumpkin
  • This is a recurring question with TensorFlow. I have given answers to similar cases [here](https://stackoverflow.com/q/53144166), [here](https://stackoverflow.com/q/49755316), [here](https://stackoverflow.com/q/49487647) and [here](https://stackoverflow.com/q/49493444) (and opened an issue about the need for better support for this [here](https://github.com/tensorflow/tensorflow/issues/18383)). Check whether any of the answers there help you, in which case this can be marked as a duplicate; if not, I can give a specific answer to your case here. – jdehesa Nov 16 '18 at 11:33
  • Also, I assume your actual case would be for a two-dimensional tensor? I ask because, while in NumPy it is pretty much the same, in TensorFlow it makes a bigger difference. And would it be overlaying a single image onto another, a single image onto a batch, or one batch of images onto another batch of images? And is each image a 2D tensor or 3D, with RGB channels? – jdehesa Nov 16 '18 at 11:35
  • I added a possible implementation. Actually, since you are not outright replacing but adding to what is already there, it can be done more easily just with padding. – jdehesa Nov 16 '18 at 15:20

1 Answer


This is one possible implementation:

import tensorflow as tf

# i and j are the row and column of the top-left corner where the patch goes
# alpha (0..1) is the intensity of the overlay (0 = original image, 1 = full patch)
def overlay_patch(img, patch, i, j, alpha=0.5):
    img_shape = tf.shape(img)
    img_rows, img_cols = img_shape[0], img_shape[1]
    patch_shape = tf.shape(patch)
    patch_rows, patch_cols = patch_shape[0], patch_shape[1]
    i_end = i + patch_rows
    j_end = j + patch_cols
    # Pre-subtract the image region so adding it back yields (1 - alpha) * img + alpha * patch
    overlay = alpha * (patch - img[i:i_end, j:j_end])
    # Pad patch
    overlay_pad = tf.pad(overlay, [[i, img_rows - i_end], [j, img_cols - j_end], [0, 0]])
    # Make final image
    img_overlay = img + overlay_pad
    return img_overlay

Test:

img = tf.placeholder(tf.float32, [None, None, None])
patch = tf.placeholder(tf.float32, [None, None, None])
i = tf.placeholder(tf.int32, [])
j = tf.placeholder(tf.int32, [])
alpha = tf.placeholder(tf.float32, [])
img_overlay = overlay_patch(img, patch, i, j, alpha)
with tf.Session() as sess:
    result = sess.run(img_overlay, feed_dict={
        img: [[[ 1], [ 2], [ 3], [ 4]],
              [[ 5], [ 6], [ 7], [ 8]],
              [[ 9], [10], [11], [12]],
              [[13], [14], [15], [16]]],
        patch: [[[10], [20], [30]],
                [[40], [50], [60]]],
        i: 2, j: 1, alpha: 0.5
    })
    print(result[..., 0])

Output:

[[ 1.   2.   3.   4. ]
 [ 5.   6.   7.   8. ]
 [ 9.  10.  15.5 21. ]
 [13.  27.  32.5 38. ]]
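
If you are on TensorFlow 2.x (an assumption; the code above targets the 1.x graph API with placeholders and a Session), the same overlay_patch function can also be called eagerly, for example:

# Assumes TF 2.x eager execution and overlay_patch defined as above
img = tf.constant([[[ 1], [ 2], [ 3], [ 4]],
                   [[ 5], [ 6], [ 7], [ 8]],
                   [[ 9], [10], [11], [12]],
                   [[13], [14], [15], [16]]], dtype=tf.float32)
patch = tf.constant([[[10], [20], [30]],
                     [[40], [50], [60]]], dtype=tf.float32)
result = overlay_patch(img, patch, i=2, j=1, alpha=0.5)
print(result[..., 0].numpy())  # same values as the output above
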
jdehesa
  • @Pumpkin I added the 3rd dimension thing, and an `alpha` parameter to control the blend (0 is the full original image, 1 is the full patch). I still try to avoid making the mask; I think just multiplication and addition should be faster. Here I "pre-subtract" the image from the patch before adding it so the final blending is correct. – jdehesa Nov 16 '18 at 15:53
  • @Pumpkin (I'm thinking this trick can actually always be used for slice replacement with `alpha=1`, at least with signed data types; see the sketch below.) – jdehesa Nov 16 '18 at 15:57
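
A minimal sketch of that slice-replacement idea, reusing the placeholders and the img_overlay graph from the test above and feeding alpha as 1.0 (assuming a signed dtype, as noted in the comment):

with tf.Session() as sess:
    result = sess.run(img_overlay, feed_dict={
        img: [[[1], [2]],
              [[3], [4]]],
        patch: [[[10]]],
        i: 0, j: 0, alpha: 1.0
    })
    print(result[..., 0])
    # => [[10.  2.]
    #     [ 3.  4.]]
    # With alpha=1 the patch replaces the region exactly instead of blending.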