Recently, I had data in a users.json file that was taking a long time to open in VS Code because the file was too large (surprising to me, since it was only a 29 MB file). I wanted to use this as a chance to play around with Python's memory usage, so I loaded the whole file into memory and it worked as expected.
I do have a question though, more of me needing an explanation, so forgive me if the answer is too obvious:
When I introspected the loaded JSON object, I found that the object size (1.3 MB) was far smaller than the file size (29.6 MB) on my file system (macOS). How can this be? The difference in size is too large to ignore. To make things more confusing, I tried a smaller file and it returned similar sizes for both measurements (on-disk and loaded, ~358 KB).
import json

with open('users.json') as infile:
    data = json.load(infile)

print(f'Object Item Count: {len(data):,} items\nObject Size: {data.__sizeof__():,} bytes')
Using sys.getsizeof(data) returns something similar, give or take some gc overhead.
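For reference, this is roughly how I compared the two calls (a minimal sketch, assuming the same users.json as above):

import json
import sys

with open('users.json') as infile:
    data = json.load(infile)

# size reported by the object's own __sizeof__()
print(f'__sizeof__():    {data.__sizeof__():,} bytes')
# sys.getsizeof() adds the garbage collector's header overhead on top of __sizeof__()
print(f'sys.getsizeof(): {sys.getsizeof(data):,} bytes')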
This, by contrast, returns the actual size of the file on disk (29,586,765 bytes, ~29.6 MB):
from pathlib import Path
Path('users.json').stat().st_size
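Putting the two measurements side by side, this is the comparison I am making (again just a sketch, same users.json assumed):

import json
from pathlib import Path

path = Path('users.json')
data = json.loads(path.read_text())

print(f'Size on disk:  {path.stat().st_size:,} bytes')  # ~29.6 MB for my file
print(f'Loaded object: {data.__sizeof__():,} bytes')     # ~1.3 MB for my file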
Can someone please explain what is happening here? I would have thought the two sizes should be in the same ballpark, or maybe I'm wrong about that.