
I am using NumPy to handle some large data matrices (around 50 GB in size). The machine where I am running this code has 128 GB of RAM, so simple linear operations at this scale shouldn't be a problem memory-wise.

However, I am seeing huge memory growth (to more than 100 GB) when running the following Python code:

import numpy as np

# memory allocations (everything works fine)
a = np.zeros((1192953, 192, 32), dtype='f8')
b = np.zeros((1192953, 192), dtype='f8')
c = np.zeros((192, 32), dtype='f8')

a[:] = b[:, :, np.newaxis] - c[np.newaxis, :, :] # memory explodes here

Please note that the initial memory allocations complete without any problems. However, when I perform the subtraction with broadcasting, memory grows to more than 100 GB. I always thought that broadcasting would avoid extra memory allocations, but now I am not sure this is always the case.

As such, can someone explain why this memory growth is happening, and how the above code could be rewritten using more memory-efficient constructs?

I am running the code in Python 2.7 within IPython Notebook.

Cesar
    `c` is *created* with shape (1, 192, 32), so why do you index it as `c[np.newaxis, :, :]`? That creates a view with shape (1, 1, 192, 32). – Warren Weckesser Jul 21 '15 at 12:07
  • Thanks for noticing - it was a typo when I was adapting the code to post here in SO – Cesar Jul 21 '15 at 12:32

2 Answers


@rth's suggestion to do the operation in smaller batches is a good one. You could also try using the function np.subtract and give it the destination array, to avoid creating an additional temporary array. You also don't need to index c as c[np.newaxis, :, :]: broadcasting aligns trailing dimensions automatically, so the leading axis is added for you.

So instead of

a[:] = b[:, :, np.newaxis] - c[np.newaxis, :, :] # memory explodes here

try

np.subtract(b[:, :, np.newaxis], c, out=a)

The out argument of np.subtract is the destination array.
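Here's a toy-scale illustration of the idea (the small shapes are arbitrary, chosen only so it runs instantly): the broadcast difference is written directly into a preallocated array, so no full-size temporary is created.

```python
import numpy as np

# small stand-ins for the question's large arrays
b = np.arange(6, dtype='f8').reshape(2, 3)
c = np.ones((3, 2), dtype='f8')
a = np.empty((2, 3, 2), dtype='f8')

# write the broadcast difference directly into a
np.subtract(b[:, :, np.newaxis], c, out=a)

assert np.array_equal(a, b[:, :, np.newaxis] - c)
```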

Warren Weckesser

Well, your array a already takes 1192953 * 192 * 32 * 8 bytes / 1e9 ≈ 58.6 GB of memory.

Broadcasting itself does not allocate additional memory for the input arrays, but the result of

b[:, :, np.newaxis] - c[np.newaxis, :, :]

is still stored in a full-size temporary array before being copied into a. So at this line you have at least two arrays with the shape of a allocated, for a total memory use of more than 116 GB.
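A small sketch of the distinction (the shapes here are arbitrary): a broadcast view costs essentially no memory, because it reuses the original buffer with a zero stride, but the arithmetic result is a real, fully allocated array.

```python
import numpy as np

b = np.zeros((4, 3))
c = np.zeros((3, 2))

# the broadcast view is free: it reuses b's buffer with a 0 stride
view = np.broadcast_to(b[:, :, np.newaxis], (4, 3, 2))
assert view.strides[-1] == 0   # no data duplicated along the new axis
assert view.base is not None   # no new buffer allocated

# but the subtraction materializes a full (4, 3, 2) result array
res = b[:, :, np.newaxis] - c
assert res.shape == (4, 3, 2)
assert res.flags['OWNDATA']
```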

You can avoid this issue by operating on a smaller subset of the array at a time:

CHUNK_SIZE = 100000
for idx in range(0, b.shape[0], CHUNK_SIZE):
    # slicing past the end of the array is safe in NumPy,
    # so the final partial chunk is handled correctly
    sl = slice(idx, idx + CHUNK_SIZE)
    a[sl] = b[sl, :, np.newaxis] - c[np.newaxis, :, :]

This will be marginally slower, but uses much less memory.
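A scaled-down check of the chunked loop (sizes are arbitrary, with a chunk size that deliberately does not divide the row count, to exercise the final partial chunk):

```python
import numpy as np

CHUNK_SIZE = 7                  # deliberately not a divisor of 23
b = np.random.rand(23, 4)
c = np.random.rand(4, 5)
a = np.empty((23, 4, 5))

for idx in range(0, b.shape[0], CHUNK_SIZE):
    sl = slice(idx, idx + CHUNK_SIZE)
    a[sl] = b[sl, :, np.newaxis] - c[np.newaxis, :, :]

# the chunked result matches the one-shot broadcast subtraction
assert np.array_equal(a, b[:, :, np.newaxis] - c)
```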

rth
    Many thanks! But so then it means that there is no built-in way within NumPy itself to store the result of this operation directly in 'a'? I suppose I am looking for something similar to C libraries where you can pass the destination matrix as an argument to the subtract function. – Cesar Jul 21 '15 at 12:02