
I am pretty printing a json in Python using this code:

json.dumps(json_output, indent=2, separators=(',', ': '))

This prints my json like:

{    
    "rows_parsed": [
        [
          "a",
          "b",
          "c",
          "d"
        ],
        [
          "e",
          "f",
          "g",
          "i"
        ],
    ]
}

However, I want it to print like:

{    
    "rows_parsed": [
        ["a","b","c","d"],
        ["e","f","g","i"],
    ]
}

How can I keep the arrays that are in arrays all on one line like above?

Ben Sandler
  • Note that your desired output does not keep _all_ arrays on one line. – Matt Ball Oct 08 '14 at 19:22
  • Great point. Let me clarify my question. – Ben Sandler Oct 08 '14 at 19:23
  • (Easy:) consider `pprint`. (Hard:) consider writing a custom JSONEncoder and pass it as `cls` argument to `dumps`. (Obligatory:) think again why you need this all. – 9000 Oct 08 '14 at 19:24
  • Possible duplicate of [JSON dumps custom formatting](https://stackoverflow.com/questions/16264515/json-dumps-custom-formatting) – match Jan 11 '18 at 13:04
  • Do you want to keep "arrays that are in arrays" all on one line, or do you really want to keep *arrays that don't contain other arrays or dicts* on one line? The latter seems like a more natural thing to want. – Imperishable Night Jun 25 '19 at 07:56

2 Answers


Here is a way to do it with as few modifications as possible:

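import json
from json import JSONEncoder
import re


class MarkedList:
    """Wraps a list that should be rendered on a single line."""

    def __init__(self, l):
        self._list = l


z = {
    "rows_parsed": [
        MarkedList(["a", "b", "c", "d"]),
        MarkedList(["e", "f", "g", "i"]),
    ]
}


class CustomJSONEncoder(JSONEncoder):
    def default(self, o):
        if isinstance(o, MarkedList):
            # Serialize the inner list as compact JSON (so None becomes null
            # and strings get double quotes), wrapped in marker sequences.
            return "##<{}>##".format(json.dumps(o._list, separators=(',', ':')))
        # Anything else is not supported: let the base class raise TypeError.
        return super().default(o)


def _unwrap(match):
    # The match is a complete JSON string literal. Decoding it undoes the
    # escaping added by the outer dumps; the slice drops the markers.
    return json.loads(match.group(0))[3:-3]


b = json.dumps(z, indent=2, separators=(',', ':'), cls=CustomJSONEncoder)
b = re.sub(r'"##<(?:[^"\\]|\\.)*>##"', _unwrap, b)

print(b)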

Basically, you wrap each list that you want formatted this way in a MarkedList instance. Those instances are serialized as strings delimited by a (hopefully unique enough) marker sequence, which is later stripped from the output of dumps. The stripping is what removes the quotes that JSON puts around a string.

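Another, much more efficient but much uglier, way is to monkey patch json.encoder._make_iterencode so that its inner _iterencode function looks something like this (the MarkedList branch is the only addition, and _my_custom_parsing is a placeholder for your own formatter):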

def _iterencode(o, _current_indent_level):
    if isinstance(o, str):
        yield _encoder(o)
    elif o is None:
        yield 'null'
    elif o is True:
        yield 'true'
    elif o is False:
        yield 'false'
    elif isinstance(o, int):
        # see comment for int/float in _make_iterencode
        yield _intstr(o)
    elif isinstance(o, float):
        # see comment for int/float in _make_iterencode
        yield _floatstr(o)
    elif isinstance(o, MarkedList):
        yield _my_custom_parsing(o)
    elif isinstance(o, (list, tuple)):
        yield from _iterencode_list(o, _current_indent_level)
    elif isinstance(o, dict):
        yield from _iterencode_dict(o, _current_indent_level)
    else:
        if markers is not None:
            markerid = id(o)
            if markerid in markers:
                raise ValueError("Circular reference detected")
            markers[markerid] = o
        o = _default(o)
        yield from _iterencode(o, _current_indent_level)
        if markers is not None:
            del markers[markerid]
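
A less invasive variant of the same idea is to subclass JSONEncoder and override encode instead of patching the module's internals. The following is only a minimal sketch, not part of the json module: CompactListEncoder and _encode are hypothetical names, and it assumes that indent is an integer, that dict keys are strings, and that all values are plain JSON types. Lists that contain no other lists or dicts are written on one line; everything else is indented as usual.

```python
import json


class CompactListEncoder(json.JSONEncoder):
    """Hypothetical encoder: lists without nested containers stay on one line."""

    def encode(self, o):
        return self._encode(o, 0)

    def _encode(self, o, level):
        pad = " " * (self.indent or 0)  # assumption: indent is an int (or None)
        if isinstance(o, (list, tuple)):
            if not any(isinstance(x, (list, tuple, dict)) for x in o):
                # Leaf list: let the stock encoder write it compactly.
                return json.dumps(list(o), separators=(",", ":"))
            inner = [pad * (level + 1) + self._encode(x, level + 1) for x in o]
            return "[\n" + ",\n".join(inner) + "\n" + pad * level + "]"
        if isinstance(o, dict):
            if not o:
                return "{}"
            inner = [
                # assumption: keys are strings
                pad * (level + 1) + json.dumps(k) + ": " + self._encode(v, level + 1)
                for k, v in o.items()
            ]
            return "{\n" + ",\n".join(inner) + "\n" + pad * level + "}"
        # Scalars (str, int, float, bool, None) go through the stock encoder.
        return json.dumps(o)


data = {"rows_parsed": [["a", "b", "c", "d"], ["e", "f", "g", "i"]]}
print(json.dumps(data, indent=4, cls=CompactListEncoder))
```

Because json.dumps instantiates the class given as cls and calls its encode method, no private function has to be replaced. The trade-off is that options such as sort_keys or a custom default are ignored by this sketch.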
Martin Gergov
  • `"##<{}>##".format(o._list)` doesn't work if the list contains None, and also renders single quotes which aren't valid json. – akxlr Jan 25 '22 at 07:01

I don't see how you could do this with json.dumps alone. After a bit of searching I came across a few options. One is to post-process the output with a custom function:

def fix_json_indent(text, indent=3):
    space_indent = indent * 4
    initial = " " * space_indent
    json_output = []
    current_level_elems = []
    all_entries_at_level = None  # holder for consecutive entries at exact space_indent level
    for line in text.splitlines():
        if line.startswith(initial):
            if line[space_indent] == " ":
                # line indented further than the level
                if all_entries_at_level:
                    current_level_elems.append(all_entries_at_level)
                    all_entries_at_level = None
                item = line.strip()
                current_level_elems.append(item)
                if item.endswith(","):
                    current_level_elems.append(" ")
            elif current_level_elems:
                # line on the same space_indent level
                # no more sublevel_entries 
                current_level_elems.append(line.strip())
                json_output.append("".join(current_level_elems))
                current_level_elems = []
            else:
                # line at the exact space_indent level but no items indented further
                if all_entries_at_level:
                    # last pending item was not the start of a new sublevel_entries.
                    json_output.append(all_entries_at_level)
                all_entries_at_level = line.rstrip()
        else:
            if all_entries_at_level:
                json_output.append(all_entries_at_level)
                all_entries_at_level = None
            if current_level_elems:
                json_output.append("".join(current_level_elems))
            json_output.append(line)
    return "\n".join(json_output)

Another possibility is a regex, but it is quite ugly and depends on the structure of the data you posted:

def fix_json_indent(text):
    import re
    return re.sub(r'{"', '{\n"', re.sub(r'\[\[', '[\n[', re.sub(r'\]\]', ']\n]', re.sub(r'}', '\n}', text))))
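
A regex can also work in the other direction, starting from the normally indented output and collapsing the innermost arrays. This is a rough sketch rather than a robust formatter: collapse_leaf_arrays is a hypothetical helper, and it assumes the text comes straight from json.dumps with an indent. Arrays whose strings contain brackets or braces are simply left expanded.

```python
import json
import re

# An opening bracket at the end of a line, content free of brackets and
# braces, then a closing bracket on a line of its own.
_LEAF_ARRAY = re.compile(r'\[\n\s*([^\[\]{}]*?)\n\s*\]')


def collapse_leaf_arrays(text):
    """Hypothetical helper: join each innermost array onto a single line."""
    def _join(match):
        # Elements are separated by a comma followed by a newline and indent.
        return "[" + re.sub(r',\n\s*', ',', match.group(1)) + "]"
    return _LEAF_ARRAY.sub(_join, text)


data = {"rows_parsed": [["a", "b", "c", "d"], ["e", "f", "g", "i"]]}
print(collapse_leaf_arrays(json.dumps(data, indent=4)))
```

Since JSON strings cannot contain raw newlines, the pattern only starts at real array openers, so the result should still parse back to the same data with json.loads.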
Oliwia