With Python, I'm saving json documents onto separate lines like this:

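from bson import json_util  # ships with pymongo

# Text append mode: json_util.dumps returns a str, and writing a str to a
# file opened with 'ab' raises TypeError on Python 3.
with open('test.json', 'a') as f:
    for document in documents:
        f.write(json_util.dumps(document) + '\n')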

and then reading like this:

with open('test.json') as f:
    for line in f:
        document = json_util.loads(line)

The ease and simplicity make me think there must be a gotcha. Is this all there is to linejson, aka jsonlines?

scharfmn
  • Are you experiencing actual problems? Can the JSON contain newlines? – Reut Sharabani Aug 27 '15 at 07:01
  • I'm trying to head off potential problems. I've tested it with json that contains newlines, and it seems to work fine. But I don't want to wake up a month from now and suddenly say: 'ohhh no'. – scharfmn Aug 27 '15 at 07:03
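
On the newline question above, a small check suggests why embedded newlines cause no trouble. It uses the standard library's `json` module as a stand-in for `json_util` (an assumption, made so the snippet runs without pymongo). The serializer escapes newlines inside strings, so as long as no `indent` argument is passed, each document occupies one physical line.

```python
import json

doc = {"text": "first line\nsecond line"}
line = json.dumps(doc)

# The newline inside the string is serialized as the two characters
# backslash and 'n', so the output contains no literal line break.
has_raw_newline = "\n" in line

# Parsing the line restores the original newline.
round_tripped = json.loads(line)
```

The thing to avoid is pretty-printing: passing `indent` spreads one document over several lines and breaks the one-document-per-line layout.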

1 Answer

Yes, that's all there is to it.

Cyphase
  • For long runs: if it gets interrupted in the middle of a write, and then restarted, the file is then corrupt. Is there a pattern for validating on each open (since I'm appending)? – scharfmn Aug 27 '15 at 08:09
  • Not necessarily, but certainly that could happen. There are different ways you could handle that; the best depends on the details of your usage. One thing you could do is make sure the reader code ignores invalid lines. You could also open the file read/write and check the last line before continuing to append data. – Cyphase Aug 27 '15 at 08:17
  • Thank you. So: for my log-file-like case: open read/write - check for final newline - if not there, add - then make first new write. Then have a `try/except` in read code for catching invalid documents. The combination would limit the damage to just the interrupted document(s). Does that sound right? – scharfmn Aug 27 '15 at 08:35
  • Sounds about right, yes. You don't even have to lose the interrupted document; just don't mark it as having been written until after `f.write(...); f.flush()` complete successfully. – Cyphase Aug 27 '15 at 09:01
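
The pattern worked out in the comments above could be sketched roughly as follows. This is a minimal illustration rather than a definitive implementation: the helper names `repair_tail`, `append_documents`, and `read_documents` are hypothetical, and the standard library's `json` stands in for `json_util` so the sketch runs without pymongo.

```python
import json
import os


def repair_tail(path):
    """Terminate a partial last line left behind by an interrupted write."""
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        return
    with open(path, "rb+") as f:
        f.seek(-1, os.SEEK_END)
        if f.read(1) != b"\n":
            f.seek(0, os.SEEK_END)
            f.write(b"\n")  # new documents now start on a fresh line


def append_documents(path, documents):
    """Append one JSON document per line, flushing after each write."""
    repair_tail(path)
    with open(path, "a", encoding="utf-8") as f:
        for document in documents:
            f.write(json.dumps(document) + "\n")
            f.flush()  # only after this returns, treat the document as written


def read_documents(path):
    """Yield every valid document, skipping lines that fail to parse."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            try:
                yield json.loads(line)
            except ValueError:  # json.JSONDecodeError is a subclass
                continue  # truncated or blank line from an interrupted run
```

Note that `flush()` only hands the data to the operating system; if the log has to survive a power loss, calling `os.fsync(f.fileno())` after the flush is one option, at some cost in speed.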