I have to save a fairly large Python data structure (nested lists and dictionaries) into a MySQL database, but I get a MemoryError during the save operation.
I have already benchmarked the save and tried different ways of dumping the data, including binary formats, but every method consumed a lot of memory. Benchmarks below (measured roughly as in the sketch after the numbers):
MAX MEMORY USAGE DURING JSON SAVE: 966.83 MB

                              json        pickle      msgpack
Size after dumping            81.03 MB    66.79 MB    33.83 MB
Serialization time            5.12 s      11.17 s     0.27 s
Deserialization time          2.57 s      1.66 s      0.52 s
Serialization peak memory     840.84 MB   1373.30 MB  732.67 MB
Deserialization peak memory   921.41 MB   1481.25 MB  1006.12 MB
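This is roughly how I measured (simplified; data stands in for the simulation results, and note that tracemalloc only tracks Python-level allocations, so the peaks above, taken from the process itself, can differ):

import json
import pickle
import time
import tracemalloc

import msgpack

def profile_dump(name, dump, data):
    # Track peak Python-level allocations and wall time for one serializer.
    tracemalloc.start()
    start = time.perf_counter()
    blob = dump(data)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"{name}: {len(blob) / 2**20:.2f} MB dumped, "
          f"{elapsed:.2f}s, peak {peak / 2**20:.2f} MB")

profile_dump("json", lambda d: json.dumps(d).encode("utf-8"), data)
profile_dump("pickle", pickle.dumps, data)
profile_dump("msgpack", msgpack.packb, data)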
msgpack seems to be the most performant library, but even its deserialization takes up a lot of memory. I also tried hickle, which is said to be memory-efficient, but the dumped file ended up being 800 MB.
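One idea I'm considering (untested at scale) is streaming the msgpack decode with msgpack.Unpacker instead of calling unpackb on the whole blob:

import msgpack

# Decode one top-level object at a time from the file, so the full
# structure never has to be resident in memory at once.
with open("results.msgpack", "rb") as fh:
    for item in msgpack.Unpacker(fh, raw=False):
        handle(item)  # hypothetical consumer of each chunk

But as far as I can tell this only helps if the payload is written as a sequence of smaller objects, not one monolithic dict like mine.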
Does anyone have a suggestion? Should I just increase the memory limit? Can MongoDB handle the save with less memory?
The stack trace is below:
Traceback (most recent call last):
  File "/opt/python/bundle/32/app/web_platform/kernel/kernel_worker/web_platform/call_kernel.py", line 139, in start_simulation
    simulation_job_object.save()
  File "/opt/python/bundle/32/app/web_platform/kernel/kernel_worker/web_platform/models.py", line 172, in save
    self.clean_fields()
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/django/db/models/base.py", line 1223, in clean_fields
    setattr(self, f.attname, f.clean(raw_value, self))
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/django/db/models/fields/__init__.py", line 630, in clean
    self.validate(value, model_instance)
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/jsonfield/fields.py", line 54, in validate
    self.get_prep_value(value)
  File "/opt/python/bundle/32/app/web_platform/kernel/kernel_worker/web_platform/models.py", line 156, in get_prep_value
    return json.dumps(value, **self.encoder_kwargs)
  File "/usr/lib64/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib64/python3.6/json/encoder.py", line 202, in encode
    return ''.join(chunks)
MemoryError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/background_task/tasks.py", line 43, in bg_runner
    func(*args, **kwargs)
  File "/opt/python/bundle/32/app/web_platform/kernel/kernel_worker/web_platform/call_kernel.py", line 157, in start_simulation
    simulation_job_object.save()
  File "/opt/python/bundle/32/app/web_platform/kernel/kernel_worker/web_platform/models.py", line 172, in save
    self.clean_fields()
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/django/db/models/base.py", line 1223, in clean_fields
    setattr(self, f.attname, f.clean(raw_value, self))
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/django/db/models/fields/__init__.py", line 630, in clean
    self.validate(value, model_instance)
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/jsonfield/fields.py", line 54, in validate
    self.get_prep_value(value)
  File "/opt/python/bundle/32/app/web_platform/kernel/kernel_worker/web_platform/models.py", line 156, in get_prep_value
    return json.dumps(value, **self.encoder_kwargs)
  File "/usr/lib64/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib64/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib64/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
MemoryError
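If I read the trace right, json.dumps fails while joining all of the encoded chunks into one giant string (''.join(chunks)). Writing with json.dump would stream those chunks to a file as they are produced and avoid that single allocation, e.g.:

import gzip
import json

# json.dump feeds the encoder's chunks to fh.write() one at a time,
# so the complete serialized string never exists in memory at once.
with gzip.open("results.json.gz", "wt", encoding="utf-8") as fh:
    json.dump(results, fh)

But jsonfield's get_prep_value calls json.dumps internally, so I don't see how to plug this into the model save.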
Example of my code (simplified; JsonNullField is our subclass of jsonfield's JSONField, defined in models.py):

from django.db import models as db_models

class Calculation(db_models.Model):
    name = db_models.CharField(max_length=120)
    results = JsonNullField(blank=True, null=True)

# run_calculation() returns the large nested lists/dicts
results = run_calculation()
calculation = Calculation(name="calculation", results=results)
calculation.save()  # MemoryError is raised here