Suppose that I have a large array:

import numpy
A = numpy.arange(100000000)

and now I truncate it:

A = A[:10]

I used to think that, given that I don't have a name bound to the original A any more, its reference count has dropped to zero and it will get garbage-collected. However, A.base surreptitiously still refers to the original array! Does that mean that the only way to clear this up is by making an explicit copy, i.e.

A = A[:10].copy()

or is there some other way to, so to speak, transfer primary ownership of the memory to the new object, so that the original can be garbage-collected? I'm worried that this may be the source of subtle memory leaks in parts of my code.

(remotely related question: Memory-efficient way to truncate large array in Matlab)

gerrit

2 Answers


When you do this:

A = A[:10]

You are creating a view on the original A (basic slice indexing always returns a view), not a new array. So indeed, the original A is not freed, because the view still needs its memory.
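To make this concrete, here is a small sketch (the names `original` and `view` are only for illustration, and the array is smaller than in the question to keep it quick) that checks the relationship between the slice and the array it came from:

```python
import numpy

original = numpy.arange(1_000_000)
view = original[:10]  # basic slicing: no data is copied

# The view holds a reference to the array that owns the memory.
print(view.base is original)

# Both arrays read from (and write to) the same underlying buffer.
print(numpy.shares_memory(view, original))

# The view itself is tiny, but the buffer it keeps alive is not.
print(view.nbytes, view.base.nbytes)
```

As long as `view` exists, `view.base` keeps the full buffer alive, even if the name `original` is rebound or deleted.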

The proper way is indeed to create a copy, either with:

A = A[:10].copy()

Or:

A = numpy.array(A[:10])
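If you want to convince yourself that the copy really lets the big array go, something like the following sketch can help. It assumes CPython, where an object is freed as soon as its reference count hits zero, and the two helper functions are hypothetical names made up for this example:

```python
import weakref
import numpy


def truncate_with_view(n_total, n_keep):
    """Keep a slice; the big array stays alive through the view's .base."""
    big = numpy.arange(n_total)
    big_ref = weakref.ref(big)  # lets us observe whether `big` still exists
    small = big[:n_keep]
    del big
    return small, big_ref


def truncate_with_copy(n_total, n_keep):
    """Keep a copy; nothing references the big array afterwards."""
    big = numpy.arange(n_total)
    big_ref = weakref.ref(big)
    small = big[:n_keep].copy()
    del big
    return small, big_ref


kept_view, view_ref = truncate_with_view(1_000_000, 10)
kept_copy, copy_ref = truncate_with_copy(1_000_000, 10)

print(view_ref() is kept_view.base)  # the big array is still alive
print(copy_ref() is None)            # the big array is gone (on CPython)
print(kept_copy.flags.owndata)       # the copy owns its own small buffer
```

Whether the freed memory is handed back to the operating system right away is up to the allocator, but at least NumPy no longer holds on to it.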
Matthieu Brucher
  • Well, I only need the first 10 elements of the original `A`. I was hoping that in case `N` is large in `A = A[:N]`, I can somehow free up only the memory used by `A[N:]`, without needing to copy `A[:N]`. But maybe I can't. – gerrit Dec 04 '18 at 11:25
  • Unfortunately, not possible. The memory is allocated and there is no way to reallocate it smaller. Some other slicing would require just some rows and some columns, and there is no way to reallocate something like that :( – Matthieu Brucher Dec 04 '18 at 11:29
  • How about [`ndarray.resize`](https://stackoverflow.com/a/53616073/974555)? Is that not reallocating it smaller? – gerrit Dec 04 '18 at 15:24
  • It may, but as I said, this can only work for a slice starting at 0, not for all slices. – Matthieu Brucher Dec 04 '18 at 15:27
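For reference, the in-place `ndarray.resize` mentioned in the comments would look roughly like this. This is only a sketch: it covers just the "keep the leading elements" case, and whether the allocator actually returns the freed tail to the operating system is not something NumPy guarantees. `refcheck=False` is used here so the example does not depend on how the interpreter counts references; that is an assumption of this example and is only safe because no views of `A` exist.

```python
import numpy

A = numpy.arange(1_000_000)
bytes_before = A.nbytes

# Shrink the array in place, keeping the first 10 elements.
# refcheck=False skips NumPy's reference-count check; only do this when
# you are sure no other array or view refers to A's buffer.
A.resize(10, refcheck=False)

print(A.shape)                  # the array is now 10 elements long
print(A.base is None)           # still owns its data, no hidden parent
print(A.nbytes < bytes_before)  # the array's buffer is now much smaller
```

With the default `refcheck=True`, NumPy raises a `ValueError` if it detects that the array references or is referenced by another array, which is exactly the situation after `B = A[:10]`.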

Per documentation:

All arrays generated by basic slicing are always views of the original array.

So, imagine your A is the Mona Lisa, and you set up a frame in front of it so that only Mona Lisa's head is visible (when viewed from the correct angle). If someone were to remove the Mona Lisa, the "painting" of Mona Lisa's head in the frame would disappear too. You would need to copy what you see in the small frame onto a new canvas to have a copy that is safe against removal of the original.

You can verify this:

import numpy
A = numpy.arange(100000000)
B = A[:10]   # basic slice: B is a view on A
B[0] = 17    # writing through the view changes A as well
A[:5]
# => array([17,  1,  2,  3,  4])

So you absolutely do need to copy in order to dissociate your new array from the original array. You can create a copy in a variety of ways: explicitly with copy, or with the array constructor. You could also use advanced indexing, which returns a copy rather than a view:

B = A[range(10)]
B[1] = 34
A[:5]
# => array([17,  1,  2,  3,  4])
B[:5]
# => array([17, 34,  2,  3,  4])
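Instead of eyeballing the printed values, you can also ask NumPy directly whether two arrays share a buffer. A minimal sketch (the names `sliced` and `indexed` are just for illustration):

```python
import numpy

A = numpy.arange(1_000_000)

sliced = A[:10]                 # basic slicing: a view
indexed = A[numpy.arange(10)]   # integer-array (advanced) indexing: a copy

print(numpy.shares_memory(A, sliced))   # True: same buffer as A
print(numpy.shares_memory(A, indexed))  # False: independent buffer

indexed[1] = 34   # modifying the copy ...
print(A[1])       # ... leaves A untouched
```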
Amadan