In TensorFlow, the functions tf.einsum, tf.matmul, and tf.tensordot can all be used for the same tasks. (I realize that tf.einsum and tf.tensordot have more general definitions; I also realize that tf.matmul has batch functionality.) In a situation where any of the three could be used, does one function tend to be the fastest? Are there other rules of thumb for choosing among them?

For example, suppose that A is a rank-2 tensor and b is a rank-1 tensor, and you want to compute the product c_i = A_ij b_j. Of the three options:

c = tf.einsum('ij,j->i', A, b)

c = tf.matmul(A, tf.expand_dims(b,1))

c = tf.tensordot(A, b, 1)

Is any generally preferable to the others?
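
For reference, here is a self-contained version of the example (a sketch assuming TensorFlow 2.x eager execution; the shapes are arbitrary, chosen just for illustration):

import numpy as np
import tensorflow as tf

A = tf.random.normal([4, 3])   # rank-2 tensor
b = tf.random.normal([3])      # rank-1 tensor

c_einsum = tf.einsum('ij,j->i', A, b)                              # shape (4,)
c_matmul = tf.squeeze(tf.matmul(A, tf.expand_dims(b, 1)), axis=1)  # squeeze (4, 1) back to (4,)
c_tensordot = tf.tensordot(A, b, 1)                                # shape (4,)

assert np.allclose(c_einsum.numpy(), c_matmul.numpy())
assert np.allclose(c_einsum.numpy(), c_tensordot.numpy())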

1 Answer

Both tf.tensordot() and tf.einsum() are syntactic sugar that wraps one or more invocations of tf.matmul() (although in some special cases tf.einsum() can reduce to the simpler elementwise tf.multiply()).
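
As a quick illustration of that special case (a sketch assuming TensorFlow 2.x eager execution): an einsum expression that contracts over nothing, such as 'ij,ij->ij', is just an elementwise product, so it can be computed with tf.multiply() rather than any matrix multiplication.

import tensorflow as tf

x = tf.random.normal([2, 3])
y = tf.random.normal([2, 3])

# 'ij,ij->ij' sums over no indices, so it is exactly an elementwise product.
via_einsum = tf.einsum('ij,ij->ij', x, y)
via_multiply = tf.multiply(x, y)

tf.debugging.assert_near(via_einsum, via_multiply)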

In the limit, I'd expect all three functions to have equivalent performance for the same computation. However, for smaller matrices it may be more efficient to use tf.matmul() directly, because it yields a simpler TensorFlow graph with fewer operations, and hence the per-operation invocation costs will be lower.
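
A rough way to check this on your own hardware is to time the three variants directly (a sketch assuming TensorFlow 2.x eager execution; the sizes, the iteration count, and any tf.function wrapping will all affect the numbers):

import timeit
import tensorflow as tf

A = tf.random.normal([128, 128])
b = tf.random.normal([128])

def via_matmul():
    return tf.squeeze(tf.matmul(A, tf.expand_dims(b, 1)), axis=1)

def via_einsum():
    return tf.einsum('ij,j->i', A, b)

def via_tensordot():
    return tf.tensordot(A, b, 1)

for name, fn in [('matmul', via_matmul), ('einsum', via_einsum), ('tensordot', via_tensordot)]:
    print(name, timeit.timeit(fn, number=1000))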

  • In my example, I have to use `tf.expand_dims` on `b` before applying `tf.matmul`. In addition, using `tf.matmul` returns a rank-2 tensor rather than a rank-1 tensor; making `c` be a rank-1 tensor requires calling `tf.squeeze` after the matrix multiplication. Do the `squeeze` and `expand_dims` operations have a meaningful time cost? – John Kleve Mar 29 '17 at 21:47
  • They're purely metadata operations, so they have a very small constant cost, which should be dominated by the `tf.matmul()` itself. – mrry Mar 29 '17 at 23:57
  • Are you sure about this? Fundamentally, the set of possible operations with einsum is a massive superset of what is possible with matmul, which always contracts over a single axis. In NumPy's einsum there can be substantial performance differences. – Eelco Hoogendoorn Apr 06 '20 at 17:59
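
As a small illustration of that last point (a sketch assuming TensorFlow 2.x; the shapes are arbitrary): an einsum that contracts over two axes at once has no single tf.matmul() form without first reshaping the operands.

import tensorflow as tf

x = tf.random.normal([5, 3, 4])
y = tf.random.normal([3, 4])

# Contract over both the j and k axes in one expression; a single tf.matmul()
# call would need the operands flattened to 2-D first.
z = tf.einsum('ijk,jk->i', x, y)
print(z.shape)  # (5,)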