
When serializing with Python's json module, the dump function does not add a newline character at the end of the output:

import json

data = {'foo': 1}
# A context manager ensures the file is flushed and closed
with open('out.json', 'w') as f:
    json.dump(data, f)

We can check that using wc:

$ wc -l out.json
0 out.json

Why is it doing that, considering that the serialized JSON is a text file and text files should end with a newline?

Peque
  • See https://codeyarns.com/2017/02/22/python-json-dump-misses-last-newline/ – liamhawkins Feb 15 '19 at 19:49
  • On the contra side, it should dump *only* the JSON value and not care about surrounding convention regarding files. What if the file-like object was a socket instead? – deceze Feb 15 '19 at 19:51
  • @liamhawkins That seems to be a workaround (which I knew about). I would like to know if there is a reason for that behavior, not how to avoid it. – Peque Feb 15 '19 at 19:51
  • "The serialized JSON is a text file and text files should end with a newline".. No, a serialized JSON is just a sequence of text, not a text file. There's no requirement for a sequence of text to end with a newline. That's all. – blhsing Feb 15 '19 at 19:54
  • There is also no requirement for a text file to end with a newline. – kindall Feb 15 '19 at 19:57
  • @blhsing Please, provide an answer instead of a comment to be able to upvote/accept it. – Peque Feb 15 '19 at 19:57
  • @Peque Done as requested then. Thanks. – blhsing Feb 15 '19 at 20:02
  • @kindall Not on Windows, but POSIX says there is such a requirement. – BoarGules Feb 15 '19 at 20:07
  • @kindall POSIX does [require that](http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html). See definitions 3.206 (defining a line) and 3.403 (defining a text file). – chepner Feb 15 '19 at 20:08
  • If the OS does not prevent it from happening, then there is no such requirement. Nobody cares whether their text files are POSIX-compliant, and makers of tools need to handle files without a terminating newline. tl;dr "You are technically correct, the best kind of correct." – kindall Feb 15 '19 at 20:10
  • @kindall Tools like `wc`, whose behavior is defined for conforming text files but not arbitrary files, care. – chepner Feb 15 '19 at 20:11
  • Arguably JSON isn’t a *text file*. Yes, it contains text, but first and foremost it contains machine readable data; which incidentally is also human readable to varying degrees. – deceze Feb 15 '19 at 20:13
  • If I, as a user, do not get the answer I expect from a tool like `wc`, then that tool is broken to me. Expecting *users* to need to know about POSIX standards in order to use basic functionality is also broken. – kindall Feb 15 '19 at 20:13
  • Also, from the POSIX spec for `wc`, the definition of `-l` is to "[w]rite to the standard output the number of <newline> characters in each input file." – chepner Feb 15 '19 at 20:13
  • @kindall You are free to use ambiguously defined tools if you like. I prefer predictable tools. – chepner Feb 15 '19 at 20:15
  • @Peque Note also that `json.dump` does not claim to write a text file; it simply writes a JSON serialization of its first argument to a file-like object. – chepner Feb 15 '19 at 20:17
  • @chepner Yeah, now I see it very clear, thanks. ^^ – Peque Feb 15 '19 at 20:23
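
The `wc -l` result in the question follows from the point made in the comments: `-l` counts newline characters, not lines. A minimal sketch of that counting rule, where `count_newlines` is a hypothetical helper written for illustration and not part of any library:

```python
import json


def count_newlines(text: str) -> int:
    """Count newline characters, which is what `wc -l` reports."""
    return text.count("\n")


# json.dumps returns the same text that json.dump would write.
serialized = json.dumps({'foo': 1})

without_newline = count_newlines(serialized)
with_newline = count_newlines(serialized + "\n")

print(without_newline)  # 0
print(with_newline)     # 1
```

So the file holds one line of content, but because that line is not terminated, the newline count is zero.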

1 Answer


Serialized JSON is just a sequence of text, not a text file, and there is no requirement for a sequence of text to end with a newline. The json.dump function is therefore right to produce output without any additional whitespace outside the boundary of the serialized JSON itself. In many cases, such as sending the JSON over a socket (as pointed out by @deceze in the comments), a trailing newline would be entirely unnecessary, so it is up to the caller to decide whether a trailing newline is appropriate for the application.
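
As a rough sketch of what "up to the caller" can look like: the snippet below uses `io.StringIO` as a stand-in for any file-like object, and `dump_line` is a hypothetical helper name introduced here for illustration, not something the json module provides.

```python
import io
import json

data = {'foo': 1}

# json.dump writes only the JSON text to the file-like object.
buffer = io.StringIO()
json.dump(data, buffer)
raw = buffer.getvalue()


def dump_line(obj, fp):
    """Hypothetical helper: serialize obj, then terminate the line."""
    json.dump(obj, fp)
    fp.write('\n')


# The caller opts in to the trailing newline when the target is a text file.
text_buffer = io.StringIO()
dump_line(data, text_buffer)
terminated = text_buffer.getvalue()

print(repr(raw))         # '{"foo": 1}'
print(repr(terminated))  # '{"foo": 1}\n'
```

The same `dump_line` call works with a file opened via `open('out.json', 'w')`, in which case `wc -l out.json` would report 1.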

blhsing
  • You could mention, in particular, @deceze's point (a JSON that could be written to a socket instead of a text file on disk). – Peque Feb 15 '19 at 20:04
  • Done as requested then. Thanks. – blhsing Feb 15 '19 at 20:07
  • Except in the jsonl case. In that case every line (including the last) should have a newline. Pandas' `to_json` does not (as of 1.1.0 at least) – Leo Oct 05 '20 at 12:49
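
For the jsonl case raised in the last comment, the caller has to terminate every record itself, including the last one. A minimal sketch with the standard json module, assuming one JSON value per line; `write_jsonl` is a hypothetical helper name, and the default `json.dumps` settings are relied on because they emit no newlines inside a record:

```python
import io
import json


def write_jsonl(records, fp):
    """Hypothetical helper: write one JSON value per line to fp."""
    for record in records:
        fp.write(json.dumps(record))
        fp.write('\n')  # every record, including the last, ends its line


out = io.StringIO()
write_jsonl([{'foo': 1}, {'bar': 2}], out)
jsonl_text = out.getvalue()

print(repr(jsonl_text))  # '{"foo": 1}\n{"bar": 2}\n'
```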