A solution to a similar problem in an unrelated setting is to use a placeholder of a fixed maximum size. Suppose I have a sequence of length 40:
t = tf.range(40)
Now at run time I get a split, say x = [6, 9, 10, 5, 1]. Now follow these steps:
Step 1:
Determine the maximum number of splits there can be, say 19.
Step 2:
num_splits = tf.placeholder(tf.int32, [19])
Step 3:
y = tf.split(t, num_or_size_splits=num_splits, axis=0)
This will break the sequence into run-time determined split sizes.
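For intuition, tf.split with explicit sizes behaves like taking consecutive slices of those sizes; here is a plain-Python sketch of that behaviour (my own illustration, not TensorFlow code):

```python
# Plain-Python analogue of tf.split(t, num_or_size_splits=sizes, axis=0):
# take consecutive slices of the given sizes. Zero-sized entries yield
# empty chunks, which is what makes the fixed-length placeholder trick work.
def split_by_sizes(t, sizes):
    chunks, start = [], 0
    for n in sizes:
        chunks.append(t[start:start + n])
        start += n
    return chunks

print(split_by_sizes(list(range(10)), [3, 5, 2, 0]))
# [[0, 1, 2], [3, 4, 5, 6, 7], [8, 9], []]
```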
Step 4:
At run time :
x = [6, 9, 10, 5, 1]
x += [40 - sum(x)] + [0 for i in range(19 - 1 - len(x))]
The first line gives the actual split sizes we need. tf.split requires the split sizes to sum to the total length being split (40 in this case), so the second line appends the remainder as one more chunk and then fills the leftover splits with size 0.
session.run(y, feed_dict={num_splits: x})
will show results like:
[0, 1, 2, 3, 4, 5]
[ 6, 7, 8, 9, 10, 11, 12, 13, 14]
[15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
[25, 26, 27, 28, 29]
[30]
[31, 32, 33, 34, 35, 36, 37, 38, 39]
[]
...
[]
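To see why the feed works, the size computation from Step 4 can be checked in plain Python:

```python
# Run-time split sizes: the actual splits, one remainder chunk, then
# zero-sized splits to pad the list out to the fixed placeholder length 19.
x = [6, 9, 10, 5, 1]
x += [40 - sum(x)] + [0 for i in range(19 - 1 - len(x))]

print(x)    # [6, 9, 10, 5, 1, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(sum(x), len(x))  # 40 19
```

The sizes sum to 40 and the list has exactly 19 entries, matching the placeholder shape.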
Step 5:
(Optional, preferred) Pad each chunk with zeros up to a fixed maximum chunk length:
def pad_up_to(t, max_in_dims, constant_values):
    s = tf.shape(t)
    paddings = [[0, m - s[i]] for (i, m) in enumerate(max_in_dims)]
    return tf.pad(t, paddings, 'CONSTANT', constant_values=constant_values)

m = []
for t1 in y:
    t1 = tf.reshape(t1, [1, -1])
    t_padded = pad_up_to(t1, [1, 15], 0)
    session.run(t_padded, feed_dict={num_splits: x})  # optional: inspect each padded chunk
    m += [t_padded]
m = tf.concat(m, 0)
This pads the chunks with zeros to create equal-sized chunks.
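The padding step is easy to see without a TensorFlow session; here is a plain-Python sketch of what pad_up_to does for the 2-D case used here (an illustration, not the graph version):

```python
# Plain-Python sketch of the padding: append the constant until each row
# reaches max_len, producing equal-sized rows ready to be stacked.
def pad_row(row, max_len, constant=0):
    return row + [constant] * (max_len - len(row))

chunks = [[0, 1, 2, 3, 4, 5], [30], []]
padded = [pad_row(c, 15) for c in chunks]
print([len(p) for p in padded])  # [15, 15, 15]
```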
NOTE: The above methodology helped me convert sequences into a variable number of sentences for NLP-related tasks.
The function pad_up_to() is taken from
Q:42334646