
I'd like to dump a Python dictionary into a JSON file with a particular custom format. For example, the following dictionary my_dict,

'text_lines': [{"line1"}, {"line2"}]

dumped with

f.write(json.dumps(my_dict, sort_keys=True, indent=2))

looks like this

  "text_lines": [
    {
      "line1"
    }, 
    {
      "line2"
    }
  ]

while I prefer that it looks like this

  "text_lines": 
  [
    {"line1"}, 
    {"line2"}
  ]

Similarly, I want the following

  "location": [
    22, 
    -8
  ]

to look like this

  "location": [22, -8]

(that is, more like a coordinate, which it is).

I know that this is a cosmetic issue, but it's important to me to preserve this formatting for easier hand editing of the file.

Is there a way to do this kind of customisation? An explained example would be great (the docs did not get me very far).

Saar Drimer
  • That, BTW, is not valid JSON ... did it even work? – UltraInstinct Apr 28 '13 at 15:56
  • Not to belabor the point, but your JSON dicts still aren't valid (i.e. each dict needs a key and a value: {"line1": "value1"}). Did you ever figure this out? I'm not sure how to use the JSONEncoder to do this. – Tim Ludwinski Oct 22 '14 at 15:11

3 Answers


I have used the example provided by Tim Ludwinski and adapted it to my preference:

import json


class CompactJSONEncoder(json.JSONEncoder):
    """A JSON Encoder that puts small lists on single lines."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.indentation_level = 0

    def encode(self, o):
        """Encode JSON object *o* with respect to single line lists."""

        if isinstance(o, (list, tuple)):
            if self._is_single_line_list(o):
                return "[" + ", ".join(json.dumps(el) for el in o) + "]"
            else:
                self.indentation_level += 1
                output = [self.indent_str + self.encode(el) for el in o]
                self.indentation_level -= 1
                return "[\n" + ",\n".join(output) + "\n" + self.indent_str + "]"

        elif isinstance(o, dict):
            self.indentation_level += 1
            output = [self.indent_str + f"{json.dumps(k)}: {self.encode(v)}" for k, v in o.items()]
            self.indentation_level -= 1
            return "{\n" + ",\n".join(output) + "\n" + self.indent_str + "}"

        else:
            return json.dumps(o)

    def _is_single_line_list(self, o):
        """Return True for a short, flat list (at most 2 items, about 60 characters)."""
        return (
            isinstance(o, (list, tuple))
            and not any(isinstance(el, (list, tuple, dict)) for el in o)
            and len(o) <= 2
            and len(str(o)) - 2 <= 60
        )

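    @property
    def indent_str(self) -> str:
        # self.indent is the `indent` argument: an int, a str, or None.
        if isinstance(self.indent, int):
            return " " * (self.indentation_level * self.indent)
        return (self.indent or "") * self.indentation_level

    def iterencode(self, o, **kwargs):
        """Required to also work with `json.dump`."""
        # Yield the whole document as one chunk; json.dump writes each chunk it receives.
        yield self.encode(o)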

Also see the version I have in use.

jmm
  • I really like your solution, especially the one in the gist, thank you. – faysou Aug 27 '20 at 21:34
  • Would you please post a version that works with json.dump? It will have to provide/override the iterencode() method. – chrisinmtown Aug 26 '21 at 14:57
  • @chrisinmtown I have updated the linked gist to include your `iterencode` suggestion. So far, this solves the `json.dump` vs. `json.dumps` discrepancy for me :-) – jmm Aug 27 '21 at 11:16
  • @jmm Any idea how to include ensure_ascii=False in that class? Whenever I run `json.dump(file, f, indent=2, ensure_ascii=False, cls=CompactJSONEncoder)`, the `ensure_ascii=False` part does not work. – ZZZ Oct 28 '22 at 09:29
  • @ZZZ See the linked version, where I keep an updated version of this encoder. Should be working with the most recent version :) – jmm Oct 28 '22 at 11:45
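
On the ensure_ascii question above: the encoder as posted builds every leaf value with the module-level json.dumps(), which applies its own defaults and so never sees the options passed to the encoder instance. Below is a minimal sketch of one way around that. It is not the version from the linked gist. The names OptionAwareCompactEncoder, _leaf and _pad are made up for this example, and the sketch assumes an integer indent and string keys. It also keeps every flat list on one line instead of applying the length limits used above.

```python
import io
import json


class OptionAwareCompactEncoder(json.JSONEncoder):
    """Hypothetical variant whose leaves respect the encoder's own options."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._level = 0  # current nesting depth

    def _pad(self):
        # Assumes indent= was given as an int (or left as None).
        return " " * (self._level * (self.indent or 0))

    def _leaf(self, o):
        # Go through the base class so that ensure_ascii, allow_nan and
        # default() apply. A bare json.dumps(o) would ignore all of them.
        return "".join(super().iterencode(o))

    def encode(self, o):
        if isinstance(o, (list, tuple)):
            if not any(isinstance(el, (list, tuple, dict)) for el in o):
                # Flat list: keep it on a single line.
                return "[" + ", ".join(self._leaf(el) for el in o) + "]"
            self._level += 1
            items = [self._pad() + self.encode(el) for el in o]
            self._level -= 1
            return "[\n" + ",\n".join(items) + "\n" + self._pad() + "]"
        if isinstance(o, dict):
            if not o:
                return "{}"
            pairs = sorted(o.items()) if self.sort_keys else o.items()
            self._level += 1
            # Assumes string keys, as in ordinary JSON objects.
            items = [self._pad() + self._leaf(k) + ": " + self.encode(v)
                     for k, v in pairs]
            self._level -= 1
            return "{\n" + ",\n".join(items) + "\n" + self._pad() + "}"
        return self._leaf(o)

    def iterencode(self, o, _one_shot=False):
        # json.dump() writes whatever chunks iterencode() yields.
        yield self.encode(o)


data = {"name": "café", "location": [22, -8]}
text = json.dumps(data, indent=2, ensure_ascii=False, cls=OptionAwareCompactEncoder)
print(text)

# The same encoder through json.dump, writing to an in-memory file.
buffer = io.StringIO()
json.dump(data, buffer, indent=2, ensure_ascii=False, cls=OptionAwareCompactEncoder)
```

With ensure_ascii=False the accented character is written as-is; without it, the same call escapes it as \u00e9, which is the behaviour the option is supposed to control.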

Here's something that I hacked together. Not very pretty but it seems to work. You could probably handle simple dictionaries in a similar way.

import json


class MyJSONEncoder(json.JSONEncoder):
    def __init__(self, *args, **kwargs):
        super(MyJSONEncoder, self).__init__(*args, **kwargs)
        self.current_indent = 0
        self.current_indent_str = ""

    def encode(self, o):
        # Special processing for lists
        if isinstance(o, (list, tuple)):
            primitives_only = True
            for item in o:
                if isinstance(item, (list, tuple, dict)):
                    primitives_only = False
                    break
            output = []
            if primitives_only:
                for item in o:
                    output.append(json.dumps(item))
                return "[ " + ", ".join(output) + " ]"
            else:
                self.current_indent += self.indent
                self.current_indent_str = "".join( [ " " for x in range(self.current_indent) ])
                for item in o:
                    output.append(self.current_indent_str + self.encode(item))
                self.current_indent -= self.indent
                self.current_indent_str = "".join( [ " " for x in range(self.current_indent) ])
                return "[\n" + ",\n".join(output) + "\n" + self.current_indent_str + "]"
        elif isinstance(o, dict):
            output = []
            self.current_indent += self.indent
            self.current_indent_str = "".join( [ " " for x in range(self.current_indent) ])
            for key, value in o.items():
                output.append(self.current_indent_str + json.dumps(key) + ": " + self.encode(value))
            self.current_indent -= self.indent
            self.current_indent_str = "".join( [ " " for x in range(self.current_indent) ])
            return "{\n" + ",\n".join(output) + "\n" + self.current_indent_str + "}"
        else:
            return json.dumps(o)

NOTE: This code doesn't really need to inherit from JSONEncoder; the only thing it takes from the base class is self.indent.
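
To illustrate the note, here is a rough sketch of the same idea as a plain recursive function with no JSONEncoder subclass. compact_dumps and its _depth parameter are names invented for this example; it assumes string keys and JSON-serialisable leaf values, and it writes flat lists as [22, -8] (without the inner padding used above) to match the format asked for in the question.

```python
import json


def compact_dumps(obj, indent=2, _depth=0):
    """Hypothetical helper: indent nested containers, keep flat lists on one line."""
    pad = " " * (indent * _depth)          # indentation of the closing bracket
    inner = " " * (indent * (_depth + 1))  # indentation of the children
    if isinstance(obj, (list, tuple)):
        if not any(isinstance(item, (list, tuple, dict)) for item in obj):
            # Lists of primitives stay on a single line.
            return "[" + ", ".join(json.dumps(item) for item in obj) + "]"
        rows = [inner + compact_dumps(item, indent, _depth + 1) for item in obj]
        return "[\n" + ",\n".join(rows) + "\n" + pad + "]"
    if isinstance(obj, dict):
        if not obj:
            return "{}"
        rows = [inner + json.dumps(key) + ": " + compact_dumps(value, indent, _depth + 1)
                for key, value in obj.items()]
        return "{\n" + ",\n".join(rows) + "\n" + pad + "}"
    return json.dumps(obj)


# A valid-JSON stand-in for the dictionary in the question.
my_dict = {"location": [22, -8], "text_lines": [{"line1": "a"}, {"line2": "b"}]}
out = compact_dumps(my_dict)
print(out)
```

Passing the depth as an argument avoids the mutable current_indent state, so the function is safe to call repeatedly. Small dicts are still spread over several lines; handling them would need a branch similar to the one for flat lists.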

danvk
Tim Ludwinski
  • Note that you can replace `"".join( [ " " for x in range(self.current_indent) ])` with `" " * self.current_indent` ;-) – Johan Jan 09 '17 at 13:01
  • In case anyone was wondering: this works great for pretty-printing GeoJSON. – danvk Oct 25 '18 at 17:44

You will need to create a subclass of the json.JSONEncoder class and override the methods for each type of value so they write the format you need. You may end up re-implementing most of them, depending on what your formatting needs are.

http://docs.python.org/2/library/json.html has an example for extending the JSONEncoder.
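
For context, the documented way to extend JSONEncoder is to override default(), roughly as in the sketch below (written in the spirit of the docs example, not copied from it). That hook only converts objects the encoder cannot serialise on its own; it has no say in where line breaks go, which is why the other answers override encode() instead.

```python
import json


class ComplexEncoder(json.JSONEncoder):
    """Teach the encoder about complex numbers via the default() hook."""

    def default(self, o):
        if isinstance(o, complex):
            # Represent a complex number as a [real, imag] pair.
            return [o.real, o.imag]
        # Let the base class raise TypeError for anything else.
        return super().default(o)


text = json.dumps({"z": 2 + 1j}, cls=ComplexEncoder, indent=2)
print(text)
```

The pair returned by default() is still laid out by the standard indenting logic, one element per line, so this hook alone does not give the compact [22, -8] style.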