
I would like to perform spatial pyramid pooling in TensorFlow. This has already been answered there (and in other questions on Stack Overflow), but the proposed solution doesn't work with an unknown input shape.

Is there an implementation that handles unknown shapes at graph definition?

Jav
  • Share the code which you are using for SPP. Consider adding more details regarding your problem. – Shubham Panchal Apr 22 '20 at 02:36
  • Thank you for your comment. The code I use is similar to the code given in the link. However, I made another implementation that solves the problem. I'm gonna write it as an answer to the question. If my question is unclear, please share how I could improve it for better visibility. Thanks. – Jav Apr 22 '20 at 20:32

1 Answer

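To address this issue, I came up with a different implementation that builds a one-hot mask with one channel per pooling bin and rescales it to the input's spatial size using nearest-neighbor interpolation. Because the target size is read from tf.shape(input) at run time, it works with unknown input shapes: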

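import tensorflow as tf  # TF 1.x API (tf.image.resize_nearest_neighbor)

def avg_spp(self, input, scale, name, padding=DEFAULT_PADDING):
    # One one-hot channel per pooling bin: (batch, scale, scale, scale*scale).
    eye = tf.eye(scale*scale, batch_shape=(tf.shape(input)[0],))
    mask = tf.reshape(eye, (-1, scale, scale, scale*scale))
    # Stretch every bin over the dynamic spatial size of the input: (batch, H, W, bins).
    mask = tf.image.resize_nearest_neighbor(mask, tf.shape(input)[1:3])
    # (batch, H, W, C, 1) * (batch, H, W, 1, bins) -> (batch, H, W, C, bins)
    spp = tf.multiply(tf.expand_dims(input, 4), tf.expand_dims(mask, 3))
    # Divide by the pixels per bin, counted on the mask so zero activations are not skipped.
    counts = tf.reduce_sum(tf.expand_dims(mask, 3), axis=[1,2])
    spp = tf.divide(tf.reduce_sum(spp, axis=[1,2]), counts)
    # (batch, C, bins) -> (batch, C, scale, scale) -> (batch, scale, scale, C)
    spp = tf.reshape(spp, (-1, tf.shape(input)[3], scale, scale))
    spp = tf.transpose(spp, [0,2,3,1], name=name)
    return spp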
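
To sanity-check what the mask computes, here is a small pure-Python sketch of the same binning for a single-channel feature map. It is not part of the implementation above. The name avg_spp_reference is made up for illustration, and the bin index assumes the floor-based mapping that nearest-neighbor resizing uses without corner alignment (output pixel y reads source row floor(y * scale / height)).

```python
def avg_spp_reference(feature_map, scale):
    """Average a 2-D feature map (list of rows) into scale x scale bins.

    Hypothetical helper for illustration only. Each pixel is assigned to a bin
    with a floor-based index, which is assumed to match what nearest-neighbor
    upsampling of a (scale x scale) mask produces.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    if height < scale or width < scale:
        raise ValueError("feature map must be at least scale x scale")

    sums = [[0.0] * scale for _ in range(scale)]
    counts = [[0] * scale for _ in range(scale)]
    for y in range(height):
        bin_y = y * scale // height  # row of the bin this pixel falls into
        for x in range(width):
            bin_x = x * scale // width  # column of the bin
            sums[bin_y][bin_x] += feature_map[y][x]
            counts[bin_y][bin_x] += 1

    # Mean per bin; counts are never zero because height, width >= scale.
    return [[sums[i][j] / counts[i][j] for j in range(scale)]
            for i in range(scale)]


feature_map = [[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]]
pooled = avg_spp_reference(feature_map, 2)
print(pooled)  # [[3.5, 5.5], [11.5, 13.5]]
```

The size check matters for the TensorFlow version too: if the input is smaller than scale along either spatial axis, some bins receive no pixels and the division yields NaN.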
Jav