I am getting a multi-gigabyte json file from an S3 bucket and need to convert the json file into a python dictionary. How can this be done in python 2.7?
-
Sure, if you have a 64-bit Python installation and many GB of RAM (or if the format is actually a bunch of lines of JSON, not a single JSON object). What have you tried? What is the exact data format? How much RAM do you have available? Why are you writing new code for Python 2.7 instead of moving to Python 3? – ShadowRanger Aug 28 '18 at 13:56
-
I'm not sure why you want to do such a thing; you can use [`ijson`](https://github.com/isagalaev/ijson). But I bet it's a line-by-line `json` file and not one big `json` object, which makes things more reasonable (try `map(json.loads, lines)`; see the sketches after this thread). – Reut Sharabani Aug 28 '18 at 13:56
-
Did `json.load` actually fail with a memory error? – kabanus Aug 28 '18 at 13:56
-
I have yet to try an approach. I didn't want to go down a path only to be stopped. @kabanus are you saying there is no limit to `json.load`? I can simply do `json.load(file.read())`? – Gary Holiday Aug 28 '18 at 13:59
-
@GaryHoliday I do not know. Something like `with open('giantfile.json') as fd: json.load(fd)` (sketched in full below the thread). – kabanus Aug 28 '18 at 14:00
-
You need to `import json` if that was not clear. If it is easy to test I would try that first. – kabanus Aug 28 '18 at 14:00
-
Possible duplicate of [Is there a memory efficient and fast way to load big json files in python?](https://stackoverflow.com/questions/2400643/is-there-a-memory-efficient-and-fast-way-to-load-big-json-files-in-python) – snakecharmerb Aug 29 '18 at 05:24
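If the file really is a single multi-gigabyte JSON document, the simplest route is the one kabanus describes: pass the open file object straight to `json.load` and let it build the dictionary. A minimal sketch, assuming the object has already been downloaded from S3 to local disk (the name `giantfile.json` is a placeholder) and that you are on a 64-bit interpreter with enough free RAM to hold the decoded result:

```python
import json

# Parse the whole document in one shot.  This only works if the decoded
# Python objects (which are considerably larger than the raw JSON text)
# fit in memory.
with open('giantfile.json') as fd:   # placeholder path to the downloaded S3 object
    data = json.load(fd)             # 'data' is the resulting dict (or list)

print type(data), len(data)
```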
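If, as Reut Sharabani suspects, the file is actually newline-delimited JSON (one object per line) rather than a single document, you never need more than one record in memory at a time. A sketch of that approach; whether it applies depends entirely on the real data format:

```python
import json

def iter_records(path):
    """Yield one decoded record per line of a newline-delimited JSON file."""
    with open(path) as fd:
        for line in fd:          # the file is read lazily, one line at a time
            line = line.strip()
            if line:             # skip blank lines
                yield json.loads(line)

# Example usage: stream through the records without materialising them all.
for record in iter_records('giantfile.json'):   # placeholder path
    pass  # replace with your own per-record processing
```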
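For a single huge JSON document that does not fit in memory, the `ijson` library mentioned above parses incrementally instead of building the whole structure at once. A rough sketch, assuming the top-level value is a JSON array (the `'item'` prefix below encodes that assumption and would need adjusting for a different layout):

```python
import ijson  # pip install ijson

with open('giantfile.json', 'rb') as fd:   # placeholder path
    # ijson.items() yields a fully-built Python object for every match of the
    # prefix; 'item' matches each element of a top-level JSON array.
    for obj in ijson.items(fd, 'item'):
        pass  # replace with your own per-record processing
```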