
Given a tensor of shape Nx2, how can I select k elements from it, akin to np.random.choice (with equal probability)? Note that the value of N changes dynamically during execution, so I'm dealing with a dynamically sized tensor.

HuckleberryFinn
  • Possible duplicate of [Sampling without replacement from a given non-uniform distribution in TensorFlow](https://stackoverflow.com/questions/43310075/sampling-without-replacement-from-a-given-non-uniform-distribution-in-tensorflow) – pfm Mar 08 '18 at 16:14

2 Answers


You can wrap np.random.choice in a tf.py_func. See for example this answer. In your case, you first need to flatten your tensor into an array of length 2*N:

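import numpy as np
import tensorflow as tf

# TensorFlow 1.x API (placeholders and sessions).
a = tf.placeholder(tf.float32, shape=[None, 2])  # N is unknown until run time
size = tf.placeholder(tf.int32)
# Flatten to length 2*N, then draw `size` values.
# np.random.choice samples with replacement by default.
y = tf.py_func(lambda x, s: np.random.choice(x.reshape(-1), s), [a, size], tf.float32)
with tf.Session() as sess:
    print(sess.run(y, {a: np.random.rand(4, 2), size: 5}))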
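
For TensorFlow 2.x, where `tf.placeholder` and `tf.Session` are no longer the default API, the same idea can be sketched with `tf.numpy_function`. This is only a rough port under that assumption, not a tested replacement; the names `choice_np` and `random_choice` are made up for illustration.

```python
import numpy as np
import tensorflow as tf


def choice_np(x, k):
    # Runs as plain NumPy: flatten to length 2*N, then draw k values
    # (np.random.choice samples with replacement by default).
    return np.random.choice(x.reshape(-1), int(k))


@tf.function(input_signature=[
    tf.TensorSpec(shape=[None, 2], dtype=tf.float32),  # N stays dynamic
    tf.TensorSpec(shape=[], dtype=tf.int32),
])
def random_choice(a, size):
    y = tf.numpy_function(choice_np, [a, size], tf.float32)
    y.set_shape([None])  # numpy_function loses static shape information
    return y


a_val = tf.random.uniform((4, 2))
sample = random_choice(a_val, tf.constant(5))
print(sample.shape)
```

The wrapped function still runs in Python, so it carries the same performance cost as `tf.py_func`.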
pfm
  • When I try to run this, I get a `NameError: name 'size' is not defined` error from this line: `print(sess.run(y, {a: np.random.rand(4,2), size:5}))`. Any idea what the issue is? – SantoshGupta7 Dec 27 '18 at 06:11
  • Using `tf.py_func` is terrible, since calling a Python operation slows computation down drastically. Especially on a GPU, utilization can drop from 100% to 5%, depending on the task. You can use `tf.multinomial`, but that samples with replacement. To sample without replacement you are probably out of luck, because I know of no op that does that; the best option is to write a custom C++ op. You can, however, hack it, for example by iteratively sampling one element from the array and removing that element with something like `tf.gather`. – user2781994 Feb 20 '19 at 19:21
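
One way to stay inside the graph, offered here only as a sketch: shuffle the row indices with `tf.random.shuffle` and keep the first k, which gives k distinct rows with equal probability. For sampling with replacement, uniform integer indices from `tf.random.uniform` do the same job. The function names below are hypothetical, and the code assumes TensorFlow 2.x and that "elements" means rows of the Nx2 tensor.

```python
import tensorflow as tf

SIGNATURE = [
    tf.TensorSpec(shape=[None, 2], dtype=tf.float32),  # N is dynamic
    tf.TensorSpec(shape=[], dtype=tf.int32),           # k
]


@tf.function(input_signature=SIGNATURE)
def sample_rows(x, k):
    """Pick k distinct rows (without replacement)."""
    n = tf.shape(x)[0]                        # N, known only at run time
    idx = tf.random.shuffle(tf.range(n))[:k]  # at most N indices if k > N
    return tf.gather(x, idx)


@tf.function(input_signature=SIGNATURE)
def sample_rows_with_replacement(x, k):
    """Pick k rows, allowing repeats."""
    n = tf.shape(x)[0]
    idx = tf.random.uniform(tf.stack([k]), minval=0, maxval=n, dtype=tf.int32)
    return tf.gather(x, idx)


x = tf.reshape(tf.range(20, dtype=tf.float32), (10, 2))
print(sample_rows(x, tf.constant(4)))
print(sample_rows_with_replacement(x, tf.constant(4)))
```

Neither function calls back into Python, so both avoid the `tf.py_func` slowdown described above.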

I had a similar problem, where I wanted to subsample points from a point cloud for an implementation of PointNet. My input shape was [None, 2048, 3], and I was subsampling down to [None, 1024, 3] using the following custom layer:

import tensorflow as tf
from tensorflow.keras.layers import Layer

class SubSample(Layer):
  def __init__(self, num_samples):
    super(SubSample, self).__init__()
    self.num_samples = num_samples

  def build(self, input_shape):
    self.shape = input_shape #[None,2048,3]

  def call(self, inputs, training=None):
    k = tf.random.uniform([self.shape[1],]) #[2048,]
    # argsort of random values is a random permutation, so exactly num_samples entries are True
    bl = tf.argsort(k) < self.num_samples #[2048,]
    # The same mask is applied to every element of the batch
    res = tf.boolean_mask(inputs, bl, axis=1) #[None,1024,3]
    # Reshape needed so that channel shape is passed when `run_eagerly=False`, otherwise it returns `None`
    return tf.reshape(res, (-1, self.num_samples, self.shape[-1])) #[None,1024,3]

SubSample(1024)(tf.random.uniform((64,2048,3))).shape

>>> TensorShape([64, 1024, 3])

As far as I can tell, this works for TensorFlow 2.5.0.

Note that this isn't a direct answer to the question at hand, but it is the answer I was looking for when I stumbled across this question.
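
To see why the `tf.argsort(k) < num_samples` mask keeps exactly `num_samples` points, here is a small NumPy sketch of the same logic. It is only an illustration with made-up variable names, not part of the layer itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, num_samples = 2048, 1024

k = rng.uniform(size=n_points)  # random keys, one per point
perm = np.argsort(k)            # a random permutation of 0..n_points-1
mask = perm < num_samples       # True where the permutation value is below num_samples

# Each value 0..n_points-1 occurs exactly once in perm,
# so exactly num_samples entries of the mask are True.
points = rng.uniform(size=(4, n_points, 3))
subsampled = points[:, mask, :]  # same mask for every batch element
print(mask.sum(), subsampled.shape)  # prints: 1024 (4, 1024, 3)
```

Because one mask is shared across the batch, every point cloud in the batch keeps the same point positions; a per-example mask would be needed to subsample each one independently.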

Matt Raymond