
In version 0.4 of their API, torch introduced reshape(), to align more closely with the style of numpy. Previously, changing the shape of a torch tensor was done with view().

I wondered whether view() was going to be deprecated now and looked at the docs. It turns out that reshape() is not just a numpy-friendly alias for view(); it actually has different semantics. view() never copies: it reinterprets the existing memory, which only works if the tensor's layout is compatible with the new shape. If the new view dimensions violate this contiguity constraint, you have to call contiguous() explicitly before view(). reshape() will work even if the constraint is violated, but will silently make a copy of the data. This is the same behaviour as in numpy, where reshape can produce copies, too.
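A minimal sketch of that difference (assuming a reasonably recent torch; the exact error message may vary):

```python
import torch

t = torch.arange(6).reshape(2, 3)
nc = t.t()                         # transpose: same storage, but no longer contiguous
print(nc.is_contiguous())          # False

try:
    nc.view(-1)                    # view() refuses to flatten non-contiguous memory
except RuntimeError as err:
    print("view failed:", err)

flat_a = nc.contiguous().view(-1)  # explicit copy first, then view() works
flat_b = nc.reshape(-1)            # works directly, but silently copies here
```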

A question on view() vs reshape() in torch is here: What's the difference between reshape and view in pytorch?

If you need a copy, use clone(); if you need the same storage, use view(). The semantics of reshape() are that it may or may not share the storage, and you don't know beforehand.
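One way to see this in practice is to compare where the tensors' data lives (a sketch; comparing data_ptr() is just a quick heuristic, and it works here because every tensor starts at storage offset zero):

```python
import torch

base = torch.arange(6).reshape(2, 3)

c  = base.clone()            # always a copy
v  = base.view(3, 2)         # always shares storage (or raises)
r1 = base.reshape(3, 2)      # contiguous input: reshape returns a view
r2 = base.t().reshape(-1)    # non-contiguous input: reshape silently copies

def same_memory(a, b):
    # address of the first element; fine here since all offsets are zero
    return a.data_ptr() == b.data_ptr()

print(same_memory(base, c))    # False
print(same_memory(base, v))    # True
print(same_memory(base, r1))   # True
print(same_memory(base, r2))   # False -> a copy was made
```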

Up until now, torch only offered view(), maybe to intentionally force developers to care about memory layout. Which makes me wonder how reshape() works in tensorflow.

In torch, the distinction between a view and a copy can produce subtle bugs: you assume that two tensors share data, but they don't. In tensorflow this problem shouldn't exist. A tensorflow tensor is symbolic and doesn't hold a value; a reshape is just an Op in the tensorflow graph. While the graph is evaluated, the data in placeholders and variables doesn't change, so it is clear what data you are working on.
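For illustration, a small sketch of what that looks like with the graph-mode API tensorflow had at the time of this question (tf.placeholder / tf.Session):

```python
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 4])  # symbolic, holds no value yet
y = tf.reshape(x, [-1, 2, 2])                    # just another Op in the graph

data = np.arange(8, dtype=np.float32).reshape(2, 4)
with tf.Session() as sess:
    out = sess.run(y, feed_dict={x: data})       # the fed array is only read

print(out.shape)   # (2, 2, 2)
print(data.shape)  # (2, 4) -- unchanged
```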

But I don't know whether this could hurt performance. Copying a huge tensor can be very expensive. Do I have to be careful not to duplicate memory when using reshape()?

lhk
  • `Tensor`s in Tensorflow are always contiguous. `reshape`-ing a contiguous array always results in a contiguous array. Therefore, `reshape` in Tensorflow **never has to** create a copy (not sure whether it actually does or not). On the contrary, operations like transpose **always** make a copy. – ZisIsNotZis Nov 21 '18 at 04:25
  • But since Tensorflow is a "computation graph compiler", it does have some automatic optimizations in the graph that might optimize away the copying. I did some googling but didn't find details on that. – ZisIsNotZis Nov 21 '18 at 04:31
  • @ZisIsNotZis "Tensors in tensorflow are always contiguous" – I don't understand this. If you slice a tensor with a stride [::2], the result will not be contiguous, and such a slice is definitely possible in tensorflow. Does tensorflow implicitly call contiguous() after every op that breaks this invariant? That would not be efficient. I guess this only applies when running a session and evaluating a tensor. But still, how do you keep it contiguous? (The torch side of this is sketched after these comments.) – lhk Nov 21 '18 at 08:27
  • 1
  • Actually I read about it [here](https://www.tensorflow.org/api_docs/python/tf/transpose). At the bottom of the page, it says that Tensorflow does not support strides. Therefore, it has to be contiguous. If you look [here](https://stackoverflow.com/questions/50779869/does-tensorflow-tf-slice-incur-allocation-and-or-memory-copy), the answer also says that Tensorflow always makes a copy unless it is dim-0-aligned. – ZisIsNotZis Nov 21 '18 at 08:35
  • But this shouldn't be that big a problem, since if you are using Tensorflow you are usually doing much slower jobs like matrix multiplication etc. In that sense, copying is the fastest operation you can do with some data, and it takes relatively negligible time. – ZisIsNotZis Nov 21 '18 at 08:46
  • Huh? Strided slices are not possible? Look here: https://www.tensorflow.org/api_docs/python/tf/strided_slice – lhk Nov 21 '18 at 09:08
  • By "Tensorflow do not support strides" he means that tensorflow does not support "strided tensor" (i.e. non-contiguous tensor). Strided slicing is supported and will result in a copy, so that tensors are always contiguous. – ppwwyyxx Nov 21 '18 at 19:11

0 Answers