
Here is my example:

import numpy as np
test = [np.random.choice(range(1, 1000), 1000000) for el in range(1,1000)]

This is how much memory the object takes:

import sys
print(sys.getsizeof(test)/1024/1024/1024)
8.404254913330078e-06

that is, something like 8 KB.

When I write it to disk

import pickle
file_path = './test.pickle'
with open(file_path, 'wb') as f:
    pickle.dump(test, f)

the file takes almost 8 GB according to ls -l.

Could somebody clarify why it takes so little space in memory and so much on disk? I am guessing the in-memory numbers are not accurate.

user1700890

1 Answer


I am guessing the in-memory numbers are not accurate.

Well, this would not explain 6 orders of magnitude in size, right? ;)

test is a Python list instance, and getsizeof is shallow: for a list it reports only the list object itself, i.e. its header plus one 8-byte pointer per element (999 pointers here, hence the roughly 8 KB you measured). It does not follow those pointers, so to get the real footprint you have to inspect each element and add its size yourself (lists have no strict element type in Python, so you can't simply do size_of_element * len(list)). In your case every element is a NumPy array of 1,000,000 int64 values, about 8 MB each, and 999 × 8 MB ≈ 8 GB, which is essentially what pickle writes to disk.
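A minimal sketch of that accounting (assuming the arrays got NumPy's default int64 dtype; on Windows np.random.choice can return int32, which halves the numbers):

import sys
import numpy as np

test = [np.random.choice(range(1, 1000), 1000000) for el in range(1, 1000)]

shallow = sys.getsizeof(test)              # list header + 999 element pointers
buffers = sum(arr.nbytes for arr in test)  # raw data held by the NumPy arrays

print(shallow)                             # ~9000 bytes: the "8 KB" you saw
print(buffers / 1024 / 1024 / 1024)        # ~7.4 GiB: 999 * 1000000 * 8 bytes

The buffer total is essentially what pickle has to serialize, which is why the file lands near 8 GB (ls -l reports decimal bytes: 999 × 1,000,000 × 8 ≈ 7.99 × 10^9).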

Here is one resource: https://code.tutsplus.com/tutorials/understand-how-much-memory-your-python-objects-use--cms-25609

Here is another one: How do I determine the size of an object in Python?
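For illustration, here is a small recursive helper in the spirit of those links (deep_getsizeof is a hypothetical name, a sketch rather than a standard-library function). NumPy arrays that own their memory already report their data buffer through sys.getsizeof, so they need no special case here:

import sys

def deep_getsizeof(obj, seen=None):
    # Sum getsizeof over an object and everything it references,
    # using a set of id()s to guard against cycles and double-counting.
    seen = seen if seen is not None else set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(el, seen) for el in obj)
    return size

print(deep_getsizeof(test) / 1024 / 1024 / 1024)  # ~7.4, matching the pickle file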

tamasgal
  • @StefanPochmann uhm yep, in Python it's a bit more complicated. Objects are `PyObjects` having a reference counter and a type which points to the actual type, e.g. `PyIntObject` (a structural subtype), which again holds the actual value and another reference count etc. All of this is well explained on the linked pages and in the Python docs. – tamasgal Jan 03 '18 at 22:33