I have a large JSON file (6 GB)
that contains simple key/value pairs like:
{ "0546585b451000" : "5",
"0546585b451000111222" : "10"
}
I am using ijson
to parse this file and perform some operation on each object.
I want to delete each object from the JSON
file itself once it has been processed in the loop.
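import ijson

with open(SOURCE_JSON_FILE, 'rb') as fd:  # ijson prefers files opened in binary mode
    parser = ijson.parse(fd)
    for prefix, event, value in parser:
        # The sample values are quoted, so ijson reports them as 'string' events, not 'number'
        if event in ('string', 'number'):
            print('prefix={}, event={}, value={}'.format(prefix, event, value))
            ## Delete this row from json file now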
My intention is to shrink the actual JSON
file as I go, so that if the process breaks partway through, I can continue with the remaining keys.
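To make the resume requirement concrete: the closest workaround I can think of keeps only a count of processed pairs in a small checkpoint file, rather than the processed objects themselves. This is a minimal sketch, not something ijson provides; the names `load_checkpoint`, `save_checkpoint`, `process_resumable` and the checkpoint path are all hypothetical.

```python
import os


def load_checkpoint(path):
    """Return how many pairs were already handled (0 if there is no checkpoint yet)."""
    try:
        with open(path) as fh:
            return int(fh.read().strip() or 0)
    except FileNotFoundError:
        return 0


def save_checkpoint(path, count):
    """Write the count to a temp file, then rename it over the old checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(str(count))
    os.replace(tmp, path)  # replaces the old checkpoint in one step


def process_resumable(pairs, checkpoint_path, handle, every=1000):
    """Call handle(key, value) for each pair, skipping those counted as done.

    `pairs` is any iterable of (key, value) tuples (hypothetical interface).
    The checkpoint is updated every `every` pairs, so after a crash up to
    `every - 1` pairs may be handled a second time.
    """
    done = load_checkpoint(checkpoint_path)
    seen = 0
    for key, value in pairs:
        seen += 1
        if seen <= done:
            continue  # already handled in an earlier run
        handle(key, value)
        if seen % every == 0:
            save_checkpoint(checkpoint_path, seen)
    save_checkpoint(checkpoint_path, max(seen, done))
    return seen
```

In my case `pairs` would be `ijson.kvitems(fd, '')` over the file opened in binary mode. This still re-parses the already processed part on every restart and does not shrink the file at all, which is why I am asking about deleting in place.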
What approach would achieve this, other than dumping the processed objects into another file or a database?
Any help is appreciated.