3

I came across the following piece of code while studying Numpy:

import numpy as np

import time
import sys
S= range(1000)
print(sys.getsizeof(5)*len(S))

D= np.arange(1000)
print(D.size*D.itemsize)

The output of this is:

O/P -  14000

4000

So Numpy saves memory storage. But I want to know how does Numpy do it?

Source: https://www.edureka.co/blog/python-numpy-tutorial/

Edit: This question only answers half of my question. Doesn't mention anything regarding what the Numpy module does.

Mithil Bhoras
  • 694
  • 1
  • 7
  • 22

2 Answers2

0

In your example, D.size == len(S), so the difference is due to the difference between D.itemsize (8) and sys.getsizeof(5) (28).

D.dtype shows you that NumPy used int64 as the data type, which uses (unsurprisingly) 64 bits == 8 bytes per item. This is really only the raw numerical data, similar to a data type in C (under the hood it pretty much is exactly that).

In contrast, Python uses an int for storing the items, which (as pointed out the question linked to by FlyingTeller) is more than just the raw numerical data.

Florian Brucker
  • 9,621
  • 3
  • 48
  • 81
0

A ndarray stores its data in a contiguous data buffer

For an example in my current ipython session:

In [63]: x.shape
Out[63]: (35, 7)
In [64]: x.dtype
Out[64]: dtype('int64')
In [65]: x.size
Out[65]: 245
In [66]: x.itemsize
Out[66]: 8
In [67]: x.nbytes
Out[67]: 1960

The array referenced by x has a block of memory with info like shape and strides, and this data buffer that takes up 1960 bytes.

Identifying the memory use of a list, e.g. xl = x.tolist() is trickier. len(xl) is 35, that is, it's databuffer has 35 pointers. But each pointer references a different list of 7 elements. Each of those lists has pointers to numbers. In my example the numbers are all integers less than 255, so each is unique (repeats point to the same object). For larger integers and floats there will be a separate Python object for each. So the memory footprint of a list depends on the degree of nesting as well as the type of the individual elements.

ndarray can also have object dtype, in which case it too contains pointers to objects elsewhere in memory.

And another nuance - the primary pointer buffer of a list is slightly oversized, to make append faster.

hpaulj
  • 221,503
  • 14
  • 230
  • 353