When to use .shape and when to use .reshape?

Question

I ran into a memory problem when trying to use .reshape on a numpy array and figured if I could somehow reshape the array in place that would be great.

I realised that I could reshape arrays by simply changing the .shape value. Unfortunately when I tried using .shape I again got a memory error which has me thinking that it doesn't reshape in place.

When should I use one, and when should I use the other?

Any help is appreciated.

If you want additional information please let me know.

EDIT:

I added my code and how the matrix I want to reshape is created in case that is important.

Change the N value depending on your memory.

import numpy as np
N = 100
a = np.random.rand(N, N)
b = np.random.rand(N, N)
c = a[:, np.newaxis, :, np.newaxis] * b[np.newaxis, :, np.newaxis, :]
c = c.reshape([N*N, N*N])
c.shape = ([N, N, N, N])

EDIT2: This is a better representation. The transpose seems to be important: it changes the arrays from C-contiguous to F-contiguous, and the result of the multiplication is contiguous in the case above but not in the one below.

import numpy as np
N = 100
a = np.random.rand(N, N).T
b = np.random.rand(N, N).T
c = a[:, np.newaxis, :, np.newaxis] * b[np.newaxis, :, np.newaxis, :]
c = c.reshape([N*N, N*N])
c.shape = ([N, N, N, N])

What are a "memory problem" and a "memory error"? Do you have specific error messages? — John Zwinck, Nov 11 '14 at 02:14
Show us some code with sample inputs (you can use `numpy.random` to generate fake data or whatever, just make it be of a realistic size). — John Zwinck, Nov 11 '14 at 02:16
do you mean realistic to cause a memory problem or to not cause a memory problem? — evan54, Nov 11 '14 at 02:16
When I run your code, it allocates the expected 762 MB array in `rand()` but memory usage doesn't change on the subsequent `reshape` and `shape` lines. What about for you? What version of Python and NumPy are you using? — John Zwinck, Nov 11 '14 at 02:29
sorry had to restart, crashed on using too much memory, also I added some additional code that better resembles my actual code in case that is important. — evan54, Nov 11 '14 at 02:32
Your code still works fine for me. Creation of `c` takes 762 MB but the reshaping afterward does not increase memory usage. — John Zwinck, Nov 11 '14 at 03:12
I have no idea then. The first edit for me indeed doesn't change anything in memory usage the 2nd however does. My .flags for the contiguous are both FALSE for c in the 2nd edit... thanks for the help though — evan54, Nov 11 '14 at 03:23

score 8 · Accepted Answer · answered Nov 11 '14 at 02:23


numpy.reshape will copy the data if it can't make a proper view, whereas setting the shape will raise an error instead of copying the data.

It is not always possible to change the shape of an array without copying the data. If you want an error to be raised whenever a copy would be needed, assign the new shape to the shape attribute of the array instead of calling reshape.
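
The difference can be seen on a small array. This is a minimal sketch and not part of the original post: the variable names are illustrative, and `np.shares_memory` is used only to check whether a result is a view of the input.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # C-contiguous 3x4 array

# Contiguous input: reshape can return a view, so no data is copied.
view = a.reshape(4, 3)
print(np.shares_memory(a, view))      # True

# The transpose is not C-contiguous, so flattening it requires a copy.
t = a.T
copied = t.reshape(12)
print(np.shares_memory(t, copied))    # False

# Assigning to .shape never copies; it raises when a view is impossible.
try:
    t.shape = (12,)
    shape_error = None
except AttributeError as exc:
    shape_error = exc
print(type(shape_error).__name__)     # AttributeError
```

After the failed assignment, `t` keeps its original shape `(4, 3)`, so the error can be caught and handled without losing the array.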

ryanpattison

So the behaviour is exactly the same except how they handle the need for a copy? Also when would it need to be copied? – evan54 Nov 11 '14 at 02:28
@evan54 If the array is not contiguous it cannot be reshaped in-place, see the comments in the answer to [reshape an array in numpy](http://stackoverflow.com/questions/14476415/reshape-an-array-in-numpy) – ryanpattison Nov 11 '14 at 02:37
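
One way to check for this before reshaping a large array is to look at `.flags` and to try the same operation with a small `N`. The sketch below is an illustration under assumptions: `reshape_copies` is a hypothetical helper (not a NumPy function), and it actually performs the reshape, so it is only meant for small test arrays.

```python
import numpy as np

def reshape_copies(arr, new_shape):
    """Return True if arr.reshape(new_shape) had to copy the data.

    Hypothetical helper for small arrays: it really does the reshape.
    """
    return not np.shares_memory(arr, arr.reshape(new_shape))

N = 4  # small on purpose; the question uses N = 100

x = np.random.rand(N, N)
print(x.flags['C_CONTIGUOUS'], x.flags['F_CONTIGUOUS'])      # True False
print(x.T.flags['C_CONTIGUOUS'], x.T.flags['F_CONTIGUOUS'])  # False True

print(reshape_copies(x, N * N))      # False: a view is enough
print(reshape_copies(x.T, N * N))    # True: a copy is required

# Same construction as EDIT2 in the question. The layout of c depends on
# how NumPy orders the output of the broadcast multiplication, so the
# flags and the copy check are printed rather than assumed.
a = np.random.rand(N, N).T
b = np.random.rand(N, N).T
c = a[:, np.newaxis, :, np.newaxis] * b[np.newaxis, :, np.newaxis, :]
print(c.flags['C_CONTIGUOUS'], c.flags['F_CONTIGUOUS'])
print(reshape_copies(c, (N * N, N * N)))
```

If the last line prints `True` for a small `N`, the same reshape at full size will allocate a second array as large as `c`.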

score 0 · Answer 2 · answered May 18 '22 at 13:46

I would like to revisit this question from an OOP perspective, even though it was originally presented as a memory problem.

When to use .shape and when to use .reshape?

OOP principle of Encapsulation

Since shape is an attribute of the numpy.ndarray object, the OOP principle of encapsulation suggests changing it through a method call, such as a.reshape(...), rather than by assigning to the attribute directly.

Performance Issues

As for performance, there seems to be no difference for a contiguous array like the one below, where neither operation has to copy data.

import numpy as np
# creates an array of 1,000,000 random floats
a = np.array(np.random.rand(1_000_000))

# (1000000,)
a.shape                   

# using IPython to time both operations resulted in

# 201 ns ± 4.85 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit a.shape = (5_000, 200)

# 217 ns ± 0.957 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit a.reshape(5_000, 200)
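
The timings above cover only the case where no copy is needed. The following sketch, which assumes the standard-library `timeit` module in place of the IPython magic, adds a non-contiguous array `f` (an illustrative name) so the copying case from the accepted answer can be compared. No timings are quoted here because they depend on the machine.

```python
import timeit
import numpy as np

a = np.random.rand(1_000_000)        # contiguous 1-D array
f = np.random.rand(200, 5_000).T     # shape (5000, 200), not C-contiguous

n_fast = 10_000
n_copy = 20

# In-place shape assignment on a private copy, so `a` itself is untouched.
t_shape = timeit.timeit("w.shape = (5_000, 200)",
                        globals={"w": a.copy()}, number=n_fast) / n_fast

# reshape that can return a view.
t_view = timeit.timeit("a.reshape(5_000, 200)",
                       globals={"a": a}, number=n_fast) / n_fast

# reshape that has to copy all 1,000,000 elements.
t_copy = timeit.timeit("f.reshape(1_000_000)",
                       globals={"f": f}, number=n_copy) / n_copy

print(f".shape assignment: {t_shape * 1e9:.0f} ns per call")
print(f"reshape (view):    {t_view * 1e9:.0f} ns per call")
print(f"reshape (copy):    {t_copy * 1e9:.0f} ns per call")
```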

Running hardware

OS: Linux 4.15.0-142-generic #146~16.04.1-Ubuntu
CPU: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz, 4 cores
RAM: 16 GB
