import pandas as pd
with open(r'data.json') as f:
    df = pd.read_json(f, encoding='utf-8')

I'm getting a "Could not reserve memory block" error. The JSON file is 300 MB. Is there any limit on how much memory a running Python program can reserve? I have 8 GB of RAM on my PC and I'm using Windows 10.

Loading the JSON file into the df gives:
Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2018.1.4\helpers\pydev\pydev_run_in_console.py", line 52, in run_file
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2018.1.4\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/Beorn/PycharmProjects/project_0/projekt/test.py", line 7, in <module>
    df = pd.read_json(f, encoding='utf-8')
  File "C:\Users\Beorn\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\io\json\json.py", line 422, in read_json
    result = json_reader.read()
  File "C:\Users\Beorn\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\io\json\json.py", line 529, in read
    obj = self._get_object_parser(self.data)
  File "C:\Users\Beorn\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\io\json\json.py", line 546, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()
  File "C:\Users\Beorn\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\io\json\json.py", line 638, in parse
    self._parse_no_numpy()
  File "C:\Users\Beorn\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\io\json\json.py", line 853, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Could not reserve memory block
PyDev console: starting.
Python 3.6.6 (v3.6.6:4cf1f54eb7, Jun 27 2018, 02:47:15) [MSC v.1900 32 bit (Intel)] on win32
  • I have never seen that error, please show the full traceback. – roganjosh Jul 14 '18 at 09:05
  • did you install 32 bit version of python or 64 bit? – Jean-François Fabre Jul 14 '18 at 09:07
  • mmm, my searches seem to indicate there's more at play here. Are you using Apache? – roganjosh Jul 14 '18 at 09:10
  • 32 bit, I believe I don't use Apache – Beorn Jul 14 '18 at 09:15
  • When Pandas reads a JSON file, it loads it into memory using Python's `json` module (the relevant line from your stacktrace is https://github.com/pandas-dev/pandas/blob/v0.23.3/pandas/io/json/json.py#L853, which calls through to `json.loads`). If loading your data into a plain dictionary with `json.load()` gives the same error, then you can rule out Pandas as the problem. There are a few questions on SO about problems with loading large JSON files (e.g. https://stackoverflow.com/questions/10382253/reading-rather-large-json-files-in-python) - so this is the first thing to try. – jelford Jul 14 '18 at 09:44
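
Following up on jelford's comment, a minimal test sketch (assuming the same data.json from the question) that checks whether the interpreter is 32-bit and whether plain json.load() hits the same limit, which would rule out pandas:

import json
import sys

# A 32-bit interpreter (the traceback shows "Python36-32" and "32 bit (Intel)")
# can only address about 2 GB per process on Windows, no matter how much RAM is installed.
print("64-bit interpreter:", sys.maxsize > 2**32)

# If this also fails with a memory error, pandas is not the problem.
with open('data.json', encoding='utf-8') as f:
    data = json.load(f)

print(type(data))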

1 Answer


So after reading plenty of posts and solutions, I decided to simply reduce the file size by getting rid of data I don't need. Maybe you will find this useful. By the way, I read somewhere that you need at least 25x the size of your JSON file in memory, so in my case I would have needed well over 8 GB.

import json

# Load the full JSON document into a plain dict
with open('data.json', 'r') as data_file:
    data = json.load(data_file)

# Inspect the top-level keys and drop the ones that aren't needed
print(data.keys())
del data['author']

# Write the slimmed-down data back out (json.dump returns None, so no assignment)
with open('datav2.json', 'w') as data_file:
    json.dump(data, data_file)
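
If trimming keys isn't enough, a possible alternative (a sketch, assuming the data can be re-exported as line-delimited JSON, one record per line; the filename datav2.jsonl is hypothetical) is to let pandas parse the file in chunks instead of building one huge parse buffer:

import pandas as pd

# lines=True plus chunksize makes read_json return an iterator of smaller DataFrames,
# so the parser never holds the whole 300 MB document at once.
chunks = pd.read_json('datav2.jsonl', lines=True, chunksize=10000, encoding='utf-8')
df = pd.concat(chunks, ignore_index=True)
print(df.shape)

Note that the concatenated frame still has to fit in the process's address space, so on a 32-bit interpreter (capped at roughly 2 GB) switching to 64-bit Python remains the more robust fix.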