import tensorflow as tf
import numpy as np

dim = 1000

# Placeholders: a batch of query vectors (x1) and the reference vectors (x2).
x1 = tf.placeholder(tf.float32, shape=(None, dim))
x2 = tf.placeholder(tf.float32, shape=(None, dim))

# Row-wise Euclidean (L2) distance; tf.sub/reduction_indices are the old names
# for tf.subtract/axis.
l2diff = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(x1, x2)), axis=1))

vector1 = np.random.rand(1, dim)       # one query, broadcast against all rows of x2
all_vectors = np.random.rand(500, dim)

sess = tf.Session()
# There are no tf.Variables in this graph, so no initializer is needed.
distances = sess.run(l2diff, feed_dict={x1: vector1, x2: all_vectors})
The above code works fine, but looping over one query vector at a time is far too slow. Is there a way to compute the same distances for multiple vectors at once, say vector1 = np.random.rand(10, 1000)?
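Something along the lines of the following broadcasting sketch is what I have in mind (the names x1b/x2b are made up, and it reuses dim, sess and all_vectors from above); my worry is that the (B, N, dim) intermediate it creates is exactly the memory blow-up I describe below:

# Sketch: pairwise distances for a whole batch of queries via broadcasting.
x1b = tf.placeholder(tf.float32, shape=(None, dim))  # (B, dim) query batch
x2b = tf.placeholder(tf.float32, shape=(None, dim))  # (N, dim) references

# (B, 1, dim) - (1, N, dim) broadcasts to a (B, N, dim) difference tensor.
diff = tf.expand_dims(x1b, 1) - tf.expand_dims(x2b, 0)
pairwise = tf.sqrt(tf.reduce_sum(tf.square(diff), axis=2))  # shape (B, N)

batch = np.random.rand(10, dim)
dists = sess.run(pairwise, feed_dict={x1b: batch, x2b: all_vectors})  # (10, 500)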
I prefer this over sklearn's euclidean_distances because I need to compute the distances for 100k vectors and want to run the computation on the GPU.
I also don't want to replicate all_vectors (e.g. by tiling it once per query vector), because all_vectors already fills 70% of my machine's RAM.
Is there a way to calculate the distances by passing a batch of vectors at a time?
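For reference, the most memory-friendly formulation I am aware of expands the squared distance as ||a - b||^2 = ||a||^2 - 2*a.b + ||b||^2, so only a matmul and a (B, N) result are needed and the (B, N, dim) intermediate never exists. Below is a sketch of that idea combined with feeding the 100k queries in chunks (the chunk size 256 and the name queries are made up, and it reuses x1b/x2b from the sketch above); I am not sure this is the idiomatic way, hence the question:

# Sketch: pairwise distances without the (B, N, dim) intermediate, using
# ||a - b||^2 = ||a||^2 - 2*a.b + ||b||^2 (keepdims needs TF >= 1.5).
sq1 = tf.reduce_sum(tf.square(x1b), axis=1, keepdims=True)  # (B, 1)
sq2 = tf.reduce_sum(tf.square(x2b), axis=1)                 # (N,)
cross = tf.matmul(x1b, x2b, transpose_b=True)               # (B, N)
# Clamp at zero: floating-point cancellation can go slightly negative.
pairwise2 = tf.sqrt(tf.maximum(sq1 - 2.0 * cross + sq2, 0.0))

# Feed the queries in chunks so only one (chunk, N) block is live at a time
# ('queries' is a stand-in for my real (100000, dim) array).
queries = np.random.rand(100000, dim).astype(np.float32)
results = [sess.run(pairwise2, feed_dict={x1b: queries[i:i + 256],
                                          x2b: all_vectors})
           for i in range(0, queries.shape[0], 256)]
distances_all = np.concatenate(results, axis=0)  # (100000, 500)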