I need to write output to a JSON file that grows over time. This is what I have:

{ 
  "something": {  
     "foo" : "bar"
  }
}

And I use (spit "./my_file" my-text :append true).

With append, I include new entries, and it looks like this:

  {
    "something": { 
      "foo": "bar"
    }
  },
  {
    "something": {
      "foo": "bar"
     }
  },
}

My problem is that I need something like this:

[
  {
   "something": { 
       "foo": "bar"
      }
   },
   {
    "something": {
        "foo": "bar"
     }
   }
]

But I really do not know how to add new data inside the [ ].

Charles Duffy
nimis
  • If you mean you want to append to the file and have it be valid JSON, with a closing `]` at all times, you can't do that in *any* language using only append operations: your code needs to remove the prior trailing `]`, put a comma in its place, and so on, and those operations are outside the bounds of what you can do by merely appending. So (1) this isn't really a Clojure-specific question, and (2) the best answer is going to depend on details of your use case. – Charles Duffy Jan 20 '18 at 16:22
  • If you need writes to be atomic, for example, then you have no choice but to go for the typical write-and-rename approach. If you can modify the reader (say, it's a script using `jq`), then you're better off having each *line* be an individual JSON document -- then the append approach will work just fine. – Charles Duffy Jan 20 '18 at 16:25
  • ...if you care more about performance than about correctness (if, for example, it's acceptable for the output file not to be valid JSON when the writer is interrupted at the wrong time), then you can `seek()` the file pointer away from the end of the file before writing. Again, there's nothing Clojure-specific here: you could have the same problem in any language, and you'd have about the same set of possible solutions. – Charles Duffy Jan 20 '18 at 16:27
  • Sorry for (1), I'm not an experienced programmer and I'm new here :( – nimis Jan 20 '18 at 16:29
  • And thank you for reply :) – nimis Jan 20 '18 at 16:29
  • Could you answer my questions about your priorities (atomicity of updates vs. performance with large files) and context (i.e. whether you can modify the readers to accept a more easily updated format)? – Charles Duffy Jan 20 '18 at 16:38
  • (If you care more about safety than speed, and can't change the format and modify your readers, then your question is effectively a duplicate of https://stackoverflow.com/questions/15208986/atomic-file-replacement-in-clojure: One *can't* use `:append true` in that case, and needs to rewrite the entire file contents on every update). – Charles Duffy Jan 20 '18 at 16:46
  • I'll explain the context: every time an event happens I need to add an entry to that file, which has to follow the pattern [{data}]. At the moment I'm not worried about performance with large files; I think atomic writes are what I need. – nimis Jan 20 '18 at 16:51
  • Okay. I'm going to add an answer that's appropriate if you cared more about speed than atomicity, since I've already started one, but you shouldn't use it -- instead, you'll want to use the approach in the answer to the other question I linked above. – Charles Duffy Jan 20 '18 at 17:00
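
For reference, the line-per-document format suggested in the comments above could look roughly like the sketch below. It assumes the readers can be changed to parse one JSON document per line, and that each entry is already serialized as compact, single-line JSON; `append-json-line!` and `read-json-lines` are hypothetical names, not part of any library.

```clojure
(require '[clojure.string :as str])

(defn append-json-line!
  "Appends one already-serialized JSON document as its own line.
   Assumes json-text contains no raw newline characters."
  [file-name json-text]
  (spit file-name (str json-text "\n") :append true))

(defn read-json-lines
  "Returns the raw JSON text of each non-blank line, in file order."
  [file-name]
  (remove str/blank? (str/split-lines (slurp file-name))))
```

With this format `:append true` is enough, because nothing before the end of the file ever has to change. The file as a whole is not a single JSON array, though, so it only helps if the consumer accepts that.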

1 Answer

If you want to perform updates in-place -- meaning you care more about performance than safety -- this can be accomplished using java.io.RandomAccessFile:

(import '[java.io RandomAccessFile])

(defn append-to-json-list-in-file [file-name new-json-text]
  ;; with-open closes in reverse order, even on error: lock released, then file closed
  (with-open [raf  (RandomAccessFile. file-name "rw")
              lock (.lock (.getChannel raf))]  ;; avoid concurrent invocation across processes
    (let [current-length (.length raf)
          ;; write UTF-8 bytes; .writeBytes would keep only the low byte of each char
          write! (fn [^String s] (.write raf (.getBytes s "UTF-8")))]
      (if (zero? current-length)
        (do
          (write! "[\n")                      ;; On the first write, prepend a "["
          (write! new-json-text)              ;; ...before the data...
          (write! "\n]\n"))                   ;; ...and a final "\n]\n"
        (do
          (.seek raf (- current-length 3))    ;; move to before the last "\n]\n"
          (write! ",\n")                      ;; put a comma where that "\n" used to be
          (write! new-json-text)              ;; ...then the new data...
          (write! "\n]\n"))))))               ;; ...then a new "\n]\n"

As an example of usage: if out.txt does not already exist, then the result of the following three calls:

(append-to-json-list-in-file "out.txt" "{\"hello\": \"birds\"}")
(append-to-json-list-in-file "out.txt" "{\"hello\": \"trees\"}")
(append-to-json-list-in-file "out.txt" "{\"goodbye\": \"world\"}")

...will be a file containing:

[
{"hello": "birds"},
{"hello": "trees"},
{"goodbye": "world"}
]

Note that the locking prevents multiple processes from calling this code at once with the same output file. It doesn't provide safety from multiple threads in the same process doing concurrent invocations -- if you want that, I'd suggest using an Agent or other inherently-single-threaded construct.
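
A minimal sketch of the Agent idea, under the assumption that every writer in the process goes through the same agent. `file-writer` and `append-async!` are hypothetical names; the append function is passed in as an argument so the sketch stands on its own.

```clojure
;; One agent acts as a single-writer queue: actions sent to it run one at a time.
(def file-writer (agent nil))

(defn append-async!
  "Queues (append-fn file-name json-text) on the agent and returns immediately.
   send-off is used because the action does blocking I/O."
  [append-fn file-name json-text]
  (send-off file-writer
            (fn [_state]
              (append-fn file-name json-text)
              nil)))   ;; the agent's state is not used, so keep it nil
```

Usage would be something like `(append-async! append-to-json-list-in-file "out.txt" "{\"hello\": \"birds\"}")`, followed by `(await file-writer)` wherever the caller needs the write to have finished. If the action throws, the agent enters a failed state and rejects further sends until it is restarted, so real code would want an error handler.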

There's also some danger that this could corrupt a file that has been edited by other software -- if a file ends with "\n]\n\n\n" instead of "\n]\n", for example, then seeking to three bytes before the current length would put us in the wrong place, and we'd generate malformed output.
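
One way to guard against that case is to check the last three bytes before seeking and refuse to write when they are not what we expect. This is only a sketch; `ends-with-list-footer?` is a hypothetical helper, and it assumes the file already exists (opening a missing file in "r" mode throws).

```clojure
(import '[java.io RandomAccessFile])

(defn ends-with-list-footer?
  "True if the file's last three bytes are newline, ], newline."
  [file-name]
  (with-open [raf (RandomAccessFile. file-name "r")]
    (let [len (.length raf)]
      (and (>= len 3)
           (let [buf (byte-array 3)]
             (.seek raf (- len 3))      ;; position at the expected footer
             (.readFully raf buf)       ;; read exactly three bytes
             (= "\n]\n" (String. ^bytes buf "US-ASCII")))))))
```

A caller could run this check on a non-empty file and throw instead of appending when it returns false, which turns silent corruption into a visible error.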

If instead you care more about ensuring that output is complete and not corrupt, the relevant techniques are not JSON-specific (and call for rewriting the entire output file, rather than incrementally updating it); see Atomic file replacement in Clojure (https://stackoverflow.com/questions/15208986/atomic-file-replacement-in-clojure).
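
As a rough illustration of that write-and-rename approach (not necessarily the code from the linked answer): write the complete new contents to a temporary file in the same directory, then rename it over the target. `atomic-spit` and `write-json-list!` are hypothetical names, and the sketch assumes the caller holds all entries in memory as already-serialized JSON strings.

```clojure
(require '[clojure.string :as str])
(import '[java.io File]
        '[java.nio.file Files CopyOption StandardCopyOption])

(defn atomic-spit
  "Writes contents to a temp file next to the target, then renames it into place."
  [file-name contents]
  (let [target (.getAbsoluteFile (File. ^String file-name))
        ;; same directory as the target, so the rename stays on one filesystem
        tmp    (File/createTempFile "json-list" ".tmp" (.getParentFile target))]
    (spit tmp contents)   ;; note: no fsync here, so this is not crash-durable
    ;; ATOMIC_MOVE throws if the filesystem cannot do an atomic rename;
    ;; replacing an existing target is the usual POSIX rename behavior
    (Files/move (.toPath tmp) (.toPath target)
                (into-array CopyOption [StandardCopyOption/ATOMIC_MOVE]))))

(defn write-json-list!
  "Rewrites the whole file as a JSON array of already-serialized entries."
  [file-name json-texts]
  (atomic-spit file-name (str "[\n" (str/join ",\n" json-texts) "\n]\n")))
```

Readers then see either the old file or the new one, never a half-written array. The cost is that every update rewrites the whole file.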

Charles Duffy