I have a DataFrame with lists in one column. I want to pretty print the data as JSON.

How can I use indentation without the values in each cell also being indented?

An example:

import pandas as pd

df = pd.DataFrame(range(3))
df["lists"] = [list(range(i+1)) for i in range(3)]
print(df)

output:

   0      lists
0  0        [0]
1  1     [0, 1]
2  2  [0, 1, 2]

Now I want to print the data as JSON using:

print(df.to_json(orient="index", indent=2))

output:

{
  "0":{
    "0":0,
    "lists":[
      0
    ]
  },
  "1":{
    "0":1,
    "lists":[
      0,
      1
    ]
  },
  "2":{
    "0":2,
    "lists":[
      0,
      1,
      2
    ]
  }
}

desired output:

{
  "0":{
    "0":0,
    "lists":[0]
  },
  "1":{
    "0":1,
    "lists":[0,1]
  },
  "2":{
    "0":2,
    "lists":[0,1,2]
  }
}
Nico G.
  • They are the same. – BrainFl Apr 01 '22 at 13:53
  • To a computer they are the same. To a human, too much indentation is counterproductive. – Nico G. Apr 01 '22 at 13:56
  • Duplicate: https://stackoverflow.com/questions/26264742/pretty-print-json-but-keep-inner-arrays-on-one-line-python – braml1 Apr 01 '22 at 14:47
  • @braml1 I would argue this is a different question, because it's specifically about `pandas.DataFrame.to_json`. But since it's hard to achieve even with just the JSON package, I don't have high hopes to get an answer here. – Nico G. Apr 01 '22 at 15:49
  • As you probably also found: the `pandas.DataFrame.to_json` API (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_json.html) does not support the functionality you are looking for – braml1 Apr 01 '22 at 16:01
  • It is still a duplicate of that question. You can still use those answers to do it. – BrainFl Apr 02 '22 at 01:37

1 Answer

If you don't mind the lists appearing as strings in the JSON output, you can temporarily convert the list column to strings when printing the DataFrame:

print(df.astype({'lists':'str'}).to_json(orient="index", indent=2))
{
  "0":{
    "0":0,
    "lists":"[0]"
  },
  "1":{
    "0":1,
    "lists":"[0, 1]"
  },
  "2":{
    "0":2,
    "lists":"[0, 1, 2]"
  }
}

If you don't want to see the quotation marks, you can use a regex to remove them:

import re

result = re.sub(r'("lists":)"([^"]*)"', r"\1 \2",
                df.astype({'lists':'str'}).to_json(orient="index", indent=2))
print(result)
{
  "0":{
    "0":0,
    "lists": [0]
  },
  "1":{
    "0":1,
    "lists": [0, 1]
  },
  "2":{
    "0":2,
    "lists": [0, 1, 2]
  }
}
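
As a possible alternative to the regex, here is a minimal sketch that skips the string conversion: parse the compact JSON and re-serialize it with a small helper that indents dicts but keeps lists on one line. The helper name `dumps_compact_lists` is made up for this example and is not part of pandas or the `json` module. The `raw` string stands in for what `df.to_json(orient="index")` would return for the DataFrame in the question, so the snippet runs without pandas.

```python
import json


def dumps_compact_lists(obj, indent=2, level=0):
    """Serialize obj as JSON, indenting dicts but keeping lists on one line.

    Hypothetical helper written for this example, not a library function.
    """
    if isinstance(obj, dict):
        if not obj:
            return "{}"
        pad = " " * (indent * (level + 1))
        items = [
            f"{pad}{json.dumps(str(key))}:"
            f"{dumps_compact_lists(value, indent, level + 1)}"
            for key, value in obj.items()
        ]
        closing = " " * (indent * level)
        return "{\n" + ",\n".join(items) + "\n" + closing + "}"
    # Lists and scalars are written as compact, single-line JSON.
    # Note: dicts nested inside a list are therefore also kept on one line.
    return json.dumps(obj, separators=(",", ":"))


# Assumption: this is what df.to_json(orient="index") yields for the example df.
raw = '{"0":{"0":0,"lists":[0]},"1":{"0":1,"lists":[0,1]},"2":{"0":2,"lists":[0,1,2]}}'
pretty = dumps_compact_lists(json.loads(raw))
print(pretty)
```

With a real DataFrame the call would presumably be `dumps_compact_lists(json.loads(df.to_json(orient="index")))`. Because the lists are never turned into strings, the result stays valid JSON even when a list contains strings.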
Ynjxsjmh