
Background: I want to store a dict as JSON that has, say, two entries:

(1) An object that describes the data in (2). This is small: mostly definitions, control parameters, and other things (call it metadata) that one would like to read before using the actual data in (2). In short, I want this portion of the file to be easy for a human to read.

(2) The data itself, which is a large chunk. It only needs to be machine-readable (nobody needs to look through it when opening the file).

Problem: How can I apply a custom indent, say 4, to (1) and None to (2)? If I call json.dump(data, trig_file, indent=4) with data = {'meta_data': small_description, 'actual_data': big_chunk}, the large data is indented too, which adds a lot of whitespace and makes the file large.

Peedaruos
  • If I understand your concern: ```data = {'meta_data': small_description, 'actual_data': big_chunk}```. Here, ```small_description``` should be written with some indent, since it holds only a little information about ```big_chunk```, while ```big_chunk``` itself should be written with indent None. – Peedaruos May 26 '22 at 15:44
  • Have you considered putting multiple distinct JSON documents in the same file, and making the formatting global to each document? The code would be much more maintainable that way, and if your machine-readable content is either the first or last line, folks who want to inspect or avoid it could just use `head` or `tail` appropriately. – Charles Duffy May 26 '22 at 16:20
  • There's precedent for that: look at how Elasticsearch splits content into description/result pairs (one JSON document saying what the next result _is_, then one JSON document with that result itself); you can keep things extensible by supporting that kind of open-ended format. – Charles Duffy May 26 '22 at 16:23

2 Answers


Assuming you can append JSON to a file:

  1. Write {"meta_data":\n to the file.
  2. Append the json for small_description formatted appropriately to the file.
  3. Append ,\n"actual_data":\n to the file.
  4. Append the json for big_chunk formatted appropriately to the file.
  5. Append \n} to the file.

The idea is to do the JSON formatting of the "container" object by hand, and to use your JSON formatter, with whatever settings suit, on each of the contained objects.
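The steps above could be sketched in Python roughly as follows. This is only an illustration: the helper name `dump_mixed`, the sample values, and the in-memory `io.StringIO` buffer (standing in for the real file) are assumptions, not part of the question. It relies on the standard `json.dump` with its `indent` and `separators` parameters.

```python
import io
import json

def dump_mixed(meta, big, fp, meta_indent=4):
    """Write {"meta_data": ..., "actual_data": ...} with per-key formatting.

    Hypothetical helper: the outer object is assembled by hand, while
    json.dump serializes each contained value with its own settings.
    """
    fp.write('{"meta_data":\n')                      # step 1
    json.dump(meta, fp, indent=meta_indent)          # step 2: human-readable
    fp.write(',\n"actual_data":\n')                  # step 3
    # step 4: no indent and compact separators for the large chunk
    json.dump(big, fp, indent=None, separators=(",", ":"))
    fp.write("\n}")                                  # step 5

# Sample data, invented for the example.
small_description = {"units": "mV", "rate_hz": 1000}
big_chunk = list(range(5))

buf = io.StringIO()  # stands in for an open text file
dump_mixed(small_description, big_chunk, buf)
text = buf.getvalue()

# The result is still one ordinary JSON document.
roundtrip = json.loads(text)
```

Because the keys are written as literal strings, this only works as-is for a fixed, known set of top-level keys; arbitrary key names would need to be passed through `json.dumps` as well.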

Scott Hunter

Consider a different file format, interleaving keys and values as distinct documents concatenated together within a single file:

{"next_item": "meta_data"}
{
  "description": "human-readable content goes here",
  "split over": "several lines"
}
{"next_item": "actual_data"}
["big","machine-readable","unformatted","content","here","....."]

That way you can pass any indent parameters you want to each write, and you aren't doing any serialization by hand.
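A minimal sketch of writing such a file in Python might look like this. The function name `write_sections`, the `pretty` parameter, and the sample values are made up for illustration; only `json.dump` and its `indent` and `separators` arguments come from the standard library.

```python
import io
import json

def write_sections(sections, fp, pretty=("meta_data",)):
    """Write each entry as a header document followed by a value document.

    Hypothetical helper: names listed in `pretty` are indented for human
    readers; every other value is written compactly on a single line.
    """
    for name, value in sections.items():
        json.dump({"next_item": name}, fp)   # header naming what follows
        fp.write("\n")
        if name in pretty:
            json.dump(value, fp, indent=2)
        else:
            json.dump(value, fp, separators=(",", ":"))
        fp.write("\n")

buf = io.StringIO()  # stands in for an open text file
write_sections(
    {
        "meta_data": {"description": "human-readable content goes here"},
        "actual_data": ["big", "machine-readable", "content"],
    },
    buf,
)
lines = buf.getvalue().splitlines()
```

Since the compact value ends up on the last line, `tail -n 1` would extract it, and everything before it stays readable.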

See "How do I use the 'json' module to read in one JSON object at a time?" for how one would read a file in this format. One of its answers wisely suggests the ijson library, which accepts a multiple_values=True argument.
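For files small enough to read into memory, a standard-library alternative is `json.JSONDecoder.raw_decode`, which parses one document from a string and reports where it stopped. The sketch below is an assumption-laden illustration rather than the approach from the linked question: `iter_documents` and `read_sections` are hypothetical names, and the whole text is held in memory, so a streaming parser such as ijson is likely the better fit for very large data.

```python
import json

def iter_documents(text):
    """Yield each JSON document found in a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

def read_sections(text):
    """Pair each {"next_item": name} header with the document after it."""
    docs = iter_documents(text)
    # zip over the same iterator twice pairs consecutive documents
    return {header["next_item"]: body for header, body in zip(docs, docs)}

sample = """{"next_item": "meta_data"}
{
  "description": "human-readable content goes here",
  "split over": "several lines"
}
{"next_item": "actual_data"}
["big","machine-readable","unformatted","content","here","....."]
"""

sections = read_sections(sample)
```

Here `sections["meta_data"]` is the small dict and `sections["actual_data"]` is the list, so the caller gets back the same two-entry structure the question started with.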

Charles Duffy