
In numpy, we use ndarray.reshape() for reshaping an array.

I noticed that in pytorch, people use torch.view(...) for the same purpose, but at the same time there is also a torch.reshape(...).

So what are the differences between them, and when should I use each of them?

Mateen Ulhaq
Lifu Huang

5 Answers


torch.view has existed for a long time. It will return a tensor with the new shape. The returned tensor will share the underlying data with the original tensor. See the documentation here.

On the other hand, it seems that torch.reshape was introduced more recently, in version 0.4. According to the documentation, this method will

Returns a tensor with the same data and number of elements as input, but with the specified shape. When possible, the returned tensor will be a view of input. Otherwise, it will be a copy. Contiguous inputs and inputs with compatible strides can be reshaped without copying, but you should not depend on the copying vs. viewing behavior.

This means that torch.reshape may return either a copy or a view of the original tensor, and you cannot count on getting one or the other. According to the developer:

if you need a copy use clone() if you need the same storage use view(). The semantics of reshape() are that it may or may not share the storage and you don't know beforehand.

Another difference is that reshape() can operate on both contiguous and non-contiguous tensors, while view() can only operate on contiguous tensors. Also see here about the meaning of contiguous.
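
To see this concretely, here is a minimal sketch (data_ptr() returns the memory address of a tensor's first element, which is a rough but convenient way to check whether two tensors share storage):

>>> import torch
>>> base = torch.arange(6)
>>> v = base.view(2, 3)              # view: always shares storage
>>> v.data_ptr() == base.data_ptr()
True
>>> r = base.reshape(2, 3)           # contiguous input: reshape can (and here does) return a view
>>> r.data_ptr() == base.data_ptr()
True
>>> t = base.view(2, 3).t()          # transpose: same storage, but non-contiguous
>>> t.reshape(6).data_ptr() == base.data_ptr()   # cannot be expressed as a view, so it copies
False
>>> t.view(6)                        # view() refuses and raises a RuntimeError (see the error messages in the other answers)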

jdhao
  • Maybe emphasizing that torch.view can only operate on contiguous tensors, while torch.reshape can operate on both might be helpful too. – p13rr0m Apr 05 '18 at 09:10
  • @pierrom contiguous here referring to tensors that are stored in contiguous memory or something else? – gokul_uf Dec 04 '18 at 15:34
  • @gokul_uf Yes, you can take a look at the answer written here: https://stackoverflow.com/questions/48915810/pytorch-contiguous – MBT Dec 05 '18 at 11:12
  • What does the phrase "a view of a tensor" mean in pytorch? – Charlie Parker Jun 29 '20 at 21:37
  • It will be helpful to have an explanation on what is "compatible strides". Thanks! – bruin Dec 18 '20 at 06:08
  • when does one use `.view` vs `.shape`? – Charlie Parker Jul 15 '21 at 22:39
  • view() _can_ operate on non-contiguous tensors. See my answer for an example. – Pierre Oct 20 '21 at 18:47
  • @pierrom No, I think MBT's comment can be a bit confusing. The expression `contiguous` here does **not** mean whether the data is stored in contiguous memory blocks or not. Even if a pytorch tensor is not "contiguous", the elements are arranged in contiguous memory blocks. The expression "contiguous" here is related to the order of the elements when pytorch sees the tensor. – starriet Feb 20 '22 at 09:15
  • This should be corrected: *"The semantics of reshape() are that it may or may not share the storage and you don't know beforehand."* => you are able to **Know** beforehand. It's *not* a random operation between sharing or not sharing the storage. – starriet Apr 29 '22 at 02:10
  • This answer is clear and straight to the point, but it is too bad it can easily be missed because of other long answers shown above it. – SomethingSomething Jul 28 '22 at 10:28

Although both torch.view and torch.reshape are used to reshape tensors, here are the differences between them.

  1. As the name suggests, torch.view merely creates a view of the original tensor. The new tensor will always share its data with the original tensor. This means that if you change the original tensor, the reshaped tensor will change and vice versa.
>>> z = torch.zeros(3, 2)
>>> x = z.view(2, 3)
>>> z.fill_(1)
>>> x
tensor([[1., 1., 1.],
        [1., 1., 1.]])
  2. To ensure that the new tensor always shares its data with the original, torch.view imposes some contiguity constraints on the shapes of the two tensors [docs]. More often than not this is not a concern, but sometimes torch.view throws an error even if the shapes of the two tensors are compatible. Here's a famous counter-example.
>>> z = torch.zeros(3, 2)
>>> y = z.t()
>>> y.size()
torch.Size([2, 3])
>>> y.view(6)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: invalid argument 2: view size is not compatible with input tensor's
size and stride (at least one dimension spans across two contiguous subspaces).
Call .contiguous() before .view().
  3. torch.reshape doesn't impose any contiguity constraints, but also doesn't guarantee data sharing. The new tensor may be a view of the original tensor, or it may be a new tensor altogether.
>>> z = torch.zeros(3, 2)
>>> y = z.reshape(6)
>>> x = z.t().reshape(6)
>>> z.fill_(1)
tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])
>>> y
tensor([1., 1., 1., 1., 1., 1.])
>>> x
tensor([0., 0., 0., 0., 0., 0.])

TL;DR:
If you just want to reshape tensors, use torch.reshape. If you're also concerned about memory usage and want to ensure that the two tensors share the same data, use torch.view.
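
If you ever need to know which case reshape() gave you in a particular program, a rough check is to compare data_ptr() values. A small sketch reusing the tensors from point 3 above (data_ptr() returns the address of a tensor's first element):

>>> import torch
>>> z = torch.zeros(3, 2)
>>> y = z.reshape(6)                 # z is contiguous, so y is a view of z
>>> y.data_ptr() == z.data_ptr()
True
>>> x = z.t().reshape(6)             # z.t() is non-contiguous, so x is a copy
>>> x.data_ptr() == z.data_ptr()
False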

nikhilweee
  • Maybe it's just me, but I was confused into thinking that contiguity is the deciding factor between when reshape does and does not share data. From my own experiments, it seems that this is not the case. (Your `x` and `y` above are both contiguous). Perhaps this can be clarified? Perhaps a comment on _when_ reshape does and does not copy would be helpful? – RMurphy Mar 18 '20 at 16:09
  • `x` and `y` are contiguous but we care here about `z` and `z.t()`. `z` is contiguous so `y` and `z` share the same data. `z` and `z.t()` share the same data, but `z.t()` is not contiguous so `z.t()` and `x` do not share the same data. Therefore `x` and `y` do not share the same data. – Will May 28 '23 at 07:52

view() will try to change the shape of the tensor while keeping the underlying data allocation the same, so the data will be shared between the two tensors. reshape() will create a new underlying memory allocation if necessary.

Let's create a tensor:

a = torch.arange(8).reshape(2, 4)

[figure: initial 2D tensor]

The memory is allocated as shown below (the tensor is C contiguous, i.e. the rows are stored next to each other):

[figure: initial 2D tensor's memory allocation]

stride() gives the number of elements you need to skip over in memory to go to the next element in each dimension:

a.stride()
(4, 1)

We want its shape to become (4, 2); we can use view:

a.view(4,2)

[figure: after view to switch the dimensions]

The underlying data allocation has not changed; the tensor is still C contiguous:

[figure: memory allocation after switch]

a.view(4, 2).stride()
(2, 1)

Let's try with a.t(). The transpose doesn't modify the underlying memory allocation, and therefore a.t() is not contiguous.

a.t().is_contiguous()
False

[figure: after transpose]

[figure: memory allocation after transpose]

Although it is not contiguous, the stride information is sufficient to iterate over the tensor:

a.t().stride()
(1, 4)
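
To make that concrete, here is a small sketch of how the strides translate an index of the transpose into an offset into a's elements in memory order (the flatten() call is only used to expose those elements as a 1-D tensor; it is not part of any PyTorch iteration machinery):

>>> import torch
>>> a = torch.arange(8).reshape(2, 4)
>>> t = a.t()                        # shape (4, 2), stride (1, 4)
>>> flat = a.flatten()               # the elements in memory order: 0..7
>>> i, j = 2, 1                      # pick an element of the transpose
>>> offset = i * t.stride(0) + j * t.stride(1)
>>> flat[offset] == t[i, j]          # the strides map (i, j) to the right element
tensor(True)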

view() doesn't work anymore:

a.t().view(2, 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Below is the shape we wanted to obtain by using view(2, 4):

[figure: after transpose and reshape]

What would the memory allocation look like?

[figure: memory allocation without reshape]

The stride would be something like (4, 2), but we would have to go back to the beginning of the tensor after we reach the end. It doesn't work.

In this case, reshape() would create a new tensor with a different memory allocation to make the transpose contiguous:

[figure: memory allocation with reshape or contiguous]

Note that we can use view to split the first dimension of the transpose. Unlike what is said in the accepted and other answers, view() can operate on non-contiguous tensors!

a.t().view(2, 2, 2)

[figure: after transpose and view(2, 2, 2)]

[figure: memory allocation after transpose]

a.t().view(2, 2, 2).stride()
(2, 1, 4)

According to the documentation:

For a tensor to be viewed, the new view size must be compatible with its original size and stride, i.e., each new view dimension must either be a subspace of an original dimension, or only span across original dimensions d, d+1, …, d+k that satisfy the following contiguity-like condition that ∀i=d,…,d+k−1,
stride[i]=stride[i+1]×size[i+1]

Here that's because the first two dimensions after applying view(2, 2, 2) are subspaces of the transpose's first dimension.
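
You can check that condition by hand. The helper below is hypothetical (it is not part of PyTorch); it simply evaluates stride[d] = stride[d+1] × size[d+1] for two adjacent dimensions, i.e. whether view() would be allowed to merge them:

>>> import torch
>>> def can_merge(t, d):
...     # dims d and d+1 satisfy the contiguity-like condition from the docs
...     return t.stride(d) == t.stride(d + 1) * t.size(d + 1)
...
>>> a = torch.arange(8).reshape(2, 4)
>>> can_merge(a, 0)                  # contiguous: a.view(8) is allowed
True
>>> can_merge(a.t(), 0)              # 1 != 4 * 2, so a.t().view(8) is not
False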

For more information about contiguity, have a look at my answer in this thread.

Pierre
  • The illustration and its color darkness help me understand what `contiguous` means: whether indexing all of the next numbers in one row is contiguous or not. BTW, there is a minor typo at `b.t().is_contiguous()`, it might be `a.t().is_contiguous()`, thanks all the same! – Wade Wang Nov 06 '21 at 09:38
  • Thanks for your comment and for catching the typo! It's now fixed. – Pierre Nov 06 '21 at 11:27

Tensor.reshape() is more robust. It will work on any tensor, while Tensor.view() works only on a tensor t where t.is_contiguous() == True.

Explaining non-contiguous vs. contiguous is another story, but you can always make the tensor t contiguous by calling t.contiguous(), and then you can call view() without the error.
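
A minimal sketch of that workflow (the tensor t here is just an example made non-contiguous by a transpose):

>>> import torch
>>> t = torch.arange(6).reshape(2, 3).t()
>>> t.is_contiguous()
False
>>> t.view(6)                        # raises the RuntimeError shown in the other answers
Traceback (most recent call last):
  ...
RuntimeError: view size is not compatible with input tensor's size and stride ...
>>> t.contiguous().view(6)           # copy into contiguous memory first, then view works
tensor([0, 3, 1, 4, 2, 5])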

prosti

I would say the answers here are technically correct, but there's another reason for reshape to exist: pytorch is usually considered more convenient than other frameworks because it is closer to python and numpy. It's interesting that the question involves numpy.

Let's look at size and shape in pytorch. size is a method, so you call it like x.size(). shape in pytorch is not a function but an attribute. In numpy you also have shape, and it's not a function either - you use it as x.shape. So it's handy to have both of them in pytorch: if you came from numpy, it's nice to be able to use the same names.
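
A quick illustration (a small sketch; both spellings return the same torch.Size object):

>>> import torch
>>> x = torch.zeros(2, 3)
>>> x.size()        # method: the original pytorch spelling
torch.Size([2, 3])
>>> x.shape         # attribute: the numpy-style spelling
torch.Size([2, 3])
>>> x.size() == x.shape
True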

irudyak