I have a problem: I have a huge dict that I want to save and load, but unfortunately I get a MemoryError. The dict should not be too big; what is read out of the database is around 4 GB. I would now like to save this dict and read it back in.
However, it should be efficient (not consume much more memory) and not take too long.
What options are there at the moment? I can't get any further with pickle; I get a memory error. I have 200 GB of free disk space left.
I looked at "Fastest way to save and load a large dictionary in Python" and some other questions and blogs.
import os
import pickle
from pathlib import Path

def save_file_as_pickle(file, filename, path=os.path.join(os.getcwd(), 'dict')):
    # Create the target directory if it does not exist yet
    Path(path).mkdir(parents=True, exist_ok=True)
    # Use a context manager so the file handle is always closed
    with open(os.path.join(path, filename + '.pickle'), "wb") as f:
        pickle.dump(file, f)

save_file_as_pickle(dict, "dict")
[OUT]
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<timed eval> in <module>
~\AppData\Local\Temp/ipykernel_1532/54965140.py in save_file_as_pickle(file, filename, path)
1 def save_file_as_pickle(file, filename, path=os.path.join(os.getcwd(), 'dict')):
2 Path(path).mkdir(parents=True, exist_ok=True)
----> 3 pickle.dump( file, open( os.path.join(path, str(filename+'.pickle')), "wb" ))
MemoryError:
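One option that might avoid the memory spike is to pickle the dict entry by entry instead of in a single call. The following is only a sketch under assumptions: `dump_items` and `load_items` are hypothetical helper names, and it assumes the top level of the dict maps keys to independent records. Whether this actually avoids the MemoryError for this data has not been verified.

```python
import pickle

def dump_items(d, filepath):
    """Write each (key, value) pair as its own pickle record."""
    with open(filepath, "wb") as f:
        for item in d.items():
            # Each call uses a fresh pickler, so its memo table stays small
            pickle.dump(item, f, protocol=pickle.HIGHEST_PROTOCOL)

def load_items(filepath):
    """Rebuild the dict by reading records until the file is exhausted."""
    result = {}
    with open(filepath, "rb") as f:
        while True:
            try:
                key, value = pickle.load(f)
            except EOFError:
                # No more records in the file
                break
            result[key] = value
    return result
```

The file written this way is a sequence of small pickles rather than one large one, so it has to be read back with `load_items` (or an equivalent loop), not with a single `pickle.load`.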
What did work, but took 1 hour and used 26 GB of disk space:
import json

with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(dict, f, ensure_ascii=False, indent=4)
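Part of the file size with this approach is likely whitespace, since `indent=4` places every value on its own indented line. A small sketch comparing the two forms on a made-up record (`record` is illustrative, not the real data); `separators=(",", ":")` is the standard `json` option for dropping the spaces after delimiters.

```python
import json

# Illustrative record, loosely shaped like the example data
record = {"_key": "1", "detail": {"selector": {"number": "12312", "isTrue": True}}}

pretty = json.dumps(record, ensure_ascii=False, indent=4)
compact = json.dumps(record, ensure_ascii=False, separators=(",", ":"))

# Both strings decode to the same data; the compact one is shorter
print(len(pretty), len(compact))
```

How much this saves on the full dict depends on how deeply nested the records are; the deeper the nesting, the more indentation is written per value.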
I looked up how big my dict is in bytes. I came across the question "How to know bytes size of python object like arrays and dictionaries? - The simple way", and according to it the dict is only 8448728 bytes.
import sys
sys.getsizeof(dict)
[OUT] 8448728
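Note that `sys.getsizeof` only accounts for the memory of the dict object itself, not for the objects it refers to, which would explain such a small number for nested data. A rough sketch of a recursive estimate, assuming the data consists only of dicts, lists, tuples, sets and scalar values; `deep_getsizeof` is a hypothetical helper and its result is an approximation.

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Approximate the total size of obj and everything it references."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size

# Illustrative nested value, not the real data
nested = {"detail": {"selector": {"requirements": [{"type": "customer"}]}}}
print(sys.getsizeof(nested), deep_getsizeof(nested))
```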
What my data looks like (example):
{
    '_key': '1',
    'group': 'test',
    'data': {},
    'type': '',
    'code': '007',
    'conType': '1',
    'flag': None,
    'createdAt': '2021',
    'currency': 'EUR',
    'detail': {
        'selector': {
            'number': '12312',
            'isTrue': True,
            'requirements': [{
                'type': 'customer',
                'requirement': '1'}]
        }
    },
    'identCode': [],
}