
I was wondering why tf.nn.embedding_lookup uses a list of tensors whereas tf.gather just performs a lookup on a single tensor. Why would I ever need to do the lookup on multiple embeddings?

I think I read somewhere that it is useful for saving memory on large embeddings, but I am not sure how this would work since I don't see how splitting up the embedding would save anything.

chasep255
  • Possible duplicate of [What does tf.nn.embedding_lookup function do?](http://stackoverflow.com/questions/34870614/what-does-tf-nn-embedding-lookup-function-do) – valentin Mar 23 '17 at 09:02

1 Answer


The tf.nn.embedding_lookup function handles the case where the embedding matrix is sharded, i.e., partitioned into multiple pieces. It also works when the embedding matrix consists of a single shard, in which case it behaves just like tf.gather.
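As a minimal sketch of the single-shard case (using the TensorFlow 1.x session API, which matches the era of this question), passing one tensor makes the two ops interchangeable:

```python
import tensorflow as tf  # TensorFlow 1.x style API

# A toy 3 x 2 embedding matrix as a single tensor.
params = tf.constant([[0., 0.], [1., 1.], [2., 2.]])
ids = tf.constant([2, 0])

looked_up = tf.nn.embedding_lookup(params, ids)
gathered = tf.gather(params, ids)

with tf.Session() as sess:
    print(sess.run(looked_up))  # [[2. 2.] [0. 0.]]
    print(sess.run(gathered))   # identical output
```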

But the more interesting case is when the embedding matrix is too large to fit in one machine's memory, or when you want high bandwidth on the embedding lookup operation. In those cases it helps to partition the matrix into pieces: the shards can be distributed across machines so that everything fits in memory, and the lookup can read the shards in parallel for higher bandwidth.
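Here is a small sketch of the sharded case (again TensorFlow 1.x style; the 10 x 4 matrix and two-way split are just illustrative). With the default partition_strategy='mod', row i is stored in shard i % num_shards, at position i // num_shards within that shard, so a lookup over the list of shards returns the same rows as tf.gather on the unsharded matrix:

```python
import numpy as np
import tensorflow as tf  # TensorFlow 1.x style API

# A toy 10 x 4 embedding matrix, split into 2 shards.
full = np.arange(40, dtype=np.float32).reshape(10, 4)
shard0 = tf.constant(full[0::2])  # rows 0, 2, 4, 6, 8  (i % 2 == 0)
shard1 = tf.constant(full[1::2])  # rows 1, 3, 5, 7, 9  (i % 2 == 1)

ids = tf.constant([0, 3, 8])

# Lookup over the list of shards...
sharded = tf.nn.embedding_lookup([shard0, shard1], ids)
# ...returns the same rows as tf.gather on the unsharded matrix.
gathered = tf.gather(tf.constant(full), ids)

with tf.Session() as sess:
    a, b = sess.run([sharded, gathered])
    print(np.array_equal(a, b))  # True
```

In a real deployment each shard would typically be a variable pinned to a different device or parameter server, so the lookups against different shards run in parallel.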

keveman