
Why does numpy.zeros take up so little space?

import numpy
x = numpy.zeros(200000000)

This appears to take up no memory, while

x = numpy.repeat(0,200000000)

takes up around 1.5 GB. Does numpy.zeros create an array of empty pointers? If so, is there a way to set a pointer in the array back to empty after changing it in Cython? If I use:

x = numpy.zeros(200000000)
x[0:200000000] = 0.0

the memory usage goes way up. Is there a way, in Python or Cython, to change a value and then change it back to the form numpy.zeros originally gave it?

user3266890
  • How are you measuring the memory each array uses? If `x = numpy.zeros(200000000)` then `x.nbytes` is `1600000000` on my machine. For `x = numpy.repeat(0, 200000000)` half that memory is used (the array is created with the `int32` datatype instead of `float64`). – Alex Riley Dec 19 '14 at 22:44
  • [How do I determine the size of an object in Python?](http://stackoverflow.com/q/449560/2359271) (pay special attention to [this answer](http://stackoverflow.com/a/3373511/2359271)) – Air Dec 19 '14 at 23:06
  • I was just watching the change in the activity monitor – user3266890 Dec 19 '14 at 23:40
  • FYI: I think I just ran into the same thing on Windows Server 2012 with Python 3.7.2 and a recent numpy version. The interesting question is: what happens in the case of `x[0:10]=1`? Does the whole array get allocated, or only the required slice? Will check tomorrow :-) – Marti Nito Feb 19 '19 at 19:38
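
The sizes quoted in the comments can be checked with `nbytes`, which reports the logical size of an array's data buffer (element count times element size), not how much memory the operating system has actually committed. A minimal sketch, assuming a much smaller illustrative size `N` than the question's 200000000:

```python
import numpy as np

N = 1_000  # illustrative size, far smaller than the question's 200000000

floats = np.zeros(N)    # default dtype is float64: 8 bytes per element
ints = np.repeat(0, N)  # an integer 0 yields the platform's default integer dtype

# nbytes is simply size * itemsize, whatever the OS has physically allocated
print(floats.dtype, floats.nbytes)
print(ints.dtype, ints.nbytes)
```

The integer dtype, and therefore the second `nbytes` value, depends on the platform and numpy version, which is why one machine can report half the size of the `float64` array while another reports the same size.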

1 Answer


Are you using Linux? Linux has lazy allocation of memory. The underlying calls to malloc and calloc in numpy always 'succeed'. No memory is actually allocated until the memory is first accessed.

The zeros function uses calloc, which guarantees that any allocated memory is zeroed before it is first accessed. Therefore, numpy need not explicitly zero the array, and so the array is lazily initialised. The repeat function, by contrast, cannot rely on calloc to initialise the array. Instead it must use malloc and then copy the repeated value into every element of the array (thus forcing immediate allocation).
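
One rough way to observe this is to compare each array's logical size with how much the process's resident memory grows when the array is created. The following is only a sketch under some assumptions: it reads `/proc/self/statm`, which exists on Linux only, the helper names `resident_mib` and `growth` are made up for illustration, and `N` is an arbitrary size smaller than the one in the question. The exact numbers will vary with the allocator and system.

```python
import os
import numpy as np

N = 5_000_000  # illustrative size (about 38 MiB of float64), not the question's 200000000


def resident_mib():
    """Resident memory of this process in MiB, or None if /proc is unavailable."""
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])  # second field: resident pages
    except OSError:
        return None  # not Linux: skip the measurement
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / 2**20


def growth(make_array):
    """Build an array; return it with the change in resident memory (MiB)."""
    before = resident_mib()
    arr = make_array()
    after = resident_mib()
    delta = None if before is None else after - before
    return arr, delta


zeros, zeros_growth = growth(lambda: np.zeros(N))
repeated, repeat_growth = growth(lambda: np.repeat(0.0, N))

# Both arrays report the same logical size...
print(zeros.nbytes, repeated.nbytes)
# ...but only the second is expected to grow resident memory noticeably.
print("zeros:", zeros_growth, "MiB; repeat:", repeat_growth, "MiB")

# Writing to every element touches every page, so the OS must now back them.
before = resident_mib()
zeros[:] = 0.0
after = resident_mib()
print("after zeros[:] = 0.0:", None if before is None else after - before, "MiB")
```

On a system with lazy allocation, the growth reported for `zeros` should stay small until the final assignment, which matches the jump in memory usage described in the question.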

Dunes