I'm using ijson (https://pypi.python.org/pypi/ijson) to parse a large JSON file. It's several GB, so I can't realistically load it all into memory. Somewhere in the middle of the file, the parser raises a UnicodeDecodeError. I don't need every piece of data, so it's fine to skip the offending entry, but I can't get the parser to continue past the point where the error occurs.
My code looks something like this:
import ijson

parser = ijson.parse(file)  # file: the large JSON file, already opened
for prefix, event, value in parser:
    ...  # do stuff with each event
If I put a try/except inside the loop body, it never catches the exception, because the error is raised while the parser is producing the next item, not while my code is running. If I wrap the whole loop instead, I can't resume from where I left off (as far as I know). How can I get around this error and keep going? Alternatively, how can I fix the file in a way that doesn't require opening it in an editor or storing it all in memory?
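To make the problem concrete, here is a minimal sketch of what I think is happening. It does not use ijson at all: `fake_parser` and `consume` are hypothetical stand-ins I wrote for illustration, on the assumption that ijson.parse behaves like an ordinary Python generator. Calling next() by hand lets me catch the error, but the generator is finished afterwards, so everything after the bad bytes is lost.

```python
def fake_parser(chunks):
    """Hypothetical stand-in for ijson.parse: decodes each chunk lazily."""
    for chunk in chunks:
        # The decode runs inside the generator, like the parser's own work.
        yield chunk.decode("utf-8")


def consume(chunks):
    """Drive the generator by hand so the error can be caught per item."""
    seen = []
    parser = fake_parser(chunks)
    while True:
        try:
            item = next(parser)
        except UnicodeDecodeError:
            # Caught, but a generator that raised is finished:
            # the following next() call raises StopIteration.
            seen.append("<error>")
            continue
        except StopIteration:
            break
        seen.append(item)
    return seen


# b"\xff" is not valid UTF-8, so decoding it fails mid-stream.
result = consume([b"a", b"\xff", b"b"])
print(result)
```

The chunk after the invalid one is never reached, which matches what I see with the real file: catching the exception is possible, resuming is not.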