I have two NumPy arrays, a and b, each of shape (100, 2048). sys.getsizeof(a) returns 112, and the same is true for b.

My question: when I use c = np.concatenate((a,b),axis=0), the shape of c is (200, 2048), but sys.getsizeof(c) returns 1638512.

Why?

HAO CHEN

2 Answers

getsizeof has limited value. It can be way off for lists. For arrays it's better, but you have to understand how arrays are stored.

In [447]: import sys
     ...: import numpy as np
In [448]: a = np.arange(100)
In [449]: sys.getsizeof(a)
Out[449]: 896

But look at the size of a view:

In [450]: b = a.reshape(10,10)
In [451]: sys.getsizeof(b)
Out[451]: 112

For a view, getsizeof reports only the size of the array object itself, not the size of the shared data buffer. b doesn't have its own data buffer; it uses a's.

In [453]: a.size
Out[453]: 100
In [454]: b.size
Out[454]: 100

So my guess is that your a and b are views of some other arrays. But concatenate produces a new array with its own data buffer; it can't be a view of the other two, so its getsizeof includes the data.
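
A quick way to test that guess is to ask NumPy whether an array owns its data. This is only a sketch: flags.owndata, base, and nbytes are standard ndarray attributes, but the small arrays here are stand-ins for the ones in the question.

```python
import sys
import numpy as np

a = np.arange(100)        # freshly allocated, owns its data buffer
b = a.reshape(10, 10)     # a view that shares a's buffer

print(a.flags.owndata, a.base is None)   # True True
print(b.flags.owndata, b.base is a)      # False True

# nbytes is the size of the data an array refers to, owned or not
print(a.nbytes == b.nbytes)              # True

# getsizeof only includes the buffer when the array owns it
print(sys.getsizeof(b) < sys.getsizeof(a))   # True
```

If flags.owndata is False for your a and b, the 112 is just the array object, and nbytes is the number to look at for the data.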

In [457]: c = np.concatenate((a,b.ravel()))
In [459]: c.shape
Out[459]: (200,)
In [460]: c.size
Out[460]: 200
In [461]: sys.getsizeof(c)
Out[461]: 1696

The data buffer for a is 100*8 = 800 bytes, so the 'overhead' is 896 - 800 = 96. For c the buffer is 200*8 = 1600 bytes, again with a 96 'overhead' (1696 - 1600).
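
The same arithmetic can be done without hard-coding the element size. This is a minimal sketch; overhead is a hypothetical helper name, and the exact overhead value it prints may differ between NumPy versions, so no specific number is shown.

```python
import sys
import numpy as np

def overhead(arr):
    """Bytes reported by getsizeof beyond the data buffer (for arrays that own their data)."""
    return sys.getsizeof(arr) - arr.nbytes

a = np.arange(100)
c = np.concatenate((a, a))      # new 1-D array with its own buffer

print(a.itemsize)               # bytes per element
print(a.nbytes, c.nbytes)       # 100 and 200 elements times itemsize
print(overhead(a), overhead(c)) # same small value for both arrays
```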

hpaulj

I cannot reproduce your example:

import numpy as np
import sys

a = np.random.rand(100, 2048)
b = np.random.rand(100, 2048)

print(sys.getsizeof(a), sys.getsizeof(b))
# 1638512 1638512

c = np.concatenate((a,b), axis=0)
print(sys.getsizeof(c))
# 3276912   which is about 1638512 + 1638512
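
For what it's worth, sizes like those in the question can arise under two assumptions that the question does not confirm: that a and b are views into a larger array, and that the data is float32 (4 bytes per element). big is a hypothetical parent array used only for illustration.

```python
import sys
import numpy as np

# Assumption: a and b are slices (views) of one larger float32 array
big = np.zeros((200, 2048), dtype=np.float32)   # hypothetical parent array
a, b = big[:100], big[100:]

print(a.flags.owndata, b.flags.owndata)   # False False
print(sys.getsizeof(a) < a.nbytes)        # True: the shared buffer is not counted

c = np.concatenate((a, b), axis=0)        # new array with its own buffer
print(c.flags.owndata)                    # True
print(c.nbytes)                           # 1638400 = 200 * 2048 * 4
print(sys.getsizeof(c) - c.nbytes)        # small per-object overhead
```

1638400 is exactly 112 less than the 1638512 reported in the question, the same 112 that getsizeof gave for a and b there.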
xdze2
  • Thanks. The code I used is from someone else, see https://github.com/Maluuba/gensen. It's a sentence-embedding model that converts a string (sentence) into a 2048-dimensional vector, so every 100 sentences give a (100, 2048) array. I'm also confused about why the size of a is only 112; maybe they used some compression technique that I don't know about. Anyway, after I convert it to a list and then back to a NumPy array, the size is normal. Thanks. – HAO CHEN Sep 01 '18 at 15:49
  • Here is some detail about `getsizeof`: https://stackoverflow.com/a/17574104/8069403. Maybe it can help explain what happens. – xdze2 Sep 01 '18 at 15:55