
I have a two-dimensional list like:

data = [[1,2,3], [2,3,4], [4,5,6]]

I want to write it to a JSON file like this:

{
    "data": [
        [1,2,3],
        [2,3,4],
        [4,5,6]
    ]
}

But with json.dumps(data, indent=4, sort_keys=True) I get this instead:

{
    "data": [
        [
            1,
            2,
            3
        ],
        [
            2,
            3,
            4
        ],
        [
            4,
            5,
            6
        ]
    ]
}

There is another question, "How to implement custom indentation when pretty-printing with the JSON module?", but that one is about dictionaries.

martineau
Li Ziming

2 Answers


I thought you could use my answer to another similar question to do what you want. While it works with json.dumps(), you pointed out that for some reason it doesn't work with json.dump().

After looking into the matter, I discovered that the encode() method, which the json.JSONEncoder subclass in the linked answer overrides, is only called when dumps() is called, not when dump() is called.

Fortunately, I was able to determine that the iterencode() method does get invoked in both cases, so I was able to fix the problem by more or less simply moving the code from encode() to iterencode().
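The difference in which methods get called can be checked with a small tracing encoder. This is only an illustrative sketch: TracingEncoder and methods_called are hypothetical names made up for the demonstration and are not part of the answer's code.

```python
import io
import json


class TracingEncoder(json.JSONEncoder):
    """Hypothetical encoder that only logs which methods get called."""
    calls = []  # Shared log; the json functions create the instance themselves.

    def encode(self, o):
        TracingEncoder.calls.append('encode')
        return super().encode(o)

    def iterencode(self, o, _one_shot=False):
        TracingEncoder.calls.append('iterencode')
        return super().iterencode(o, _one_shot=_one_shot)


def methods_called(use_dump):
    """Return the encoder methods invoked by json.dump() or json.dumps()."""
    TracingEncoder.calls.clear()
    sample = {'data': [[1, 2, 3]]}
    if use_dump:
        json.dump(sample, io.StringIO(), cls=TracingEncoder, indent=4)
    else:
        json.dumps(sample, cls=TracingEncoder, indent=4)
    return list(TracingEncoder.calls)


print('dumps():', methods_called(use_dump=False))
print('dump(): ', methods_called(use_dump=True))
```

On CPython 3 this should report encode followed by iterencode for dumps(), and only iterencode for dump(), which is why moving the substitution logic into iterencode() covers both functions.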

The code below is a revised version of the code in my answer to the other question, with this change applied:

from _ctypes import PyObj_FromPtr  # see https://stackoverflow.com/a/15012814/355230
import json
import re


class NoIndent(object):
    """ Value wrapper. """
    def __init__(self, value):
        if not isinstance(value, (list, tuple)):
            raise TypeError('Only lists and tuples can be wrapped')
        self.value = value


class MyEncoder(json.JSONEncoder):
    FORMAT_SPEC = '@@{}@@'  # Unique string pattern of NoIndent object ids.
    regex = re.compile(FORMAT_SPEC.format(r'(\d+)'))  # compile(r'@@(\d+)@@')

    def __init__(self, **kwargs):
        # Keyword arguments to ignore when encoding NoIndent wrapped values.
        ignore = {'cls', 'indent'}

        # Save copy of any keyword argument values needed for use here.
        self._kwargs = {k: v for k, v in kwargs.items() if k not in ignore}
        super(MyEncoder, self).__init__(**kwargs)

    def default(self, obj):
        return (self.FORMAT_SPEC.format(id(obj)) if isinstance(obj, NoIndent)
                    else super(MyEncoder, self).default(obj))

    def iterencode(self, obj, **kwargs):
        format_spec = self.FORMAT_SPEC  # Local var to expedite access.

        # Replace any marked-up NoIndent wrapped values in the JSON repr
        # with the json.dumps() of the corresponding wrapped Python object.
        for encoded in super(MyEncoder, self).iterencode(obj, **kwargs):
            match = self.regex.search(encoded)
            if match:
                id = int(match.group(1))
                no_indent = PyObj_FromPtr(id)
                json_repr = json.dumps(no_indent.value, **self._kwargs)
                # Replace the matched id string with json formatted representation
                # of the corresponding Python object.
                encoded = encoded.replace(
                            '"{}"'.format(format_spec.format(id)), json_repr)

            yield encoded

Applying it to your question:

# Example of using it to get the results you want.

alfa = [('a','b','c'), ('d','e','f'), ('g','h','i')]
data = [(1,2,3), (2,3,4), (4,5,6)]

data_struct = {
    'data': [NoIndent(elem) for elem in data],
    'alfa': [NoIndent(elem) for elem in alfa],
}

print(json.dumps(data_struct, cls=MyEncoder, sort_keys=True, indent=4))

# Test custom JSONEncoder with json.dump()
with open('data_struct.json', 'w') as fp:
    json.dump(data_struct, fp, cls=MyEncoder, sort_keys=True, indent=4)
    fp.write('\n')  # Add a newline to very end (optional).

Resulting output:

{
    "alfa": [
        ["a", "b", "c"],
        ["d", "e", "f"],
        ["g", "h", "i"]
    ],
    "data": [
        [1, 2, 3],
        [2, 3, 4],
        [4, 5, 6]
    ]
}
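For readers who would rather not import from the private _ctypes module, here is a hedged variation on the same idea. It is not the code from this answer: RegistryEncoder is a hypothetical name, the wrapped objects are looked up in a per-encoder dict instead of by memory address, and the chunks are joined before substitution, which gives up streaming in exchange for not depending on how the output is chunked. As with the original, a real string that happens to look like "@@0@@" would be misinterpreted.

```python
import io
import json
import re


class NoIndent:
    """Wrapper marking a list or tuple that should stay on one line."""
    def __init__(self, value):
        if not isinstance(value, (list, tuple)):
            raise TypeError('Only lists and tuples can be wrapped')
        self.value = value


class RegistryEncoder(json.JSONEncoder):
    """Hypothetical variant that tracks wrappers in a dict (an assumption,
    not the answer's implementation)."""
    PLACEHOLDER = '@@{}@@'
    regex = re.compile(r'"@@(\d+)@@"')  # placeholder including its JSON quotes

    def __init__(self, **kwargs):
        # Options reused when encoding each wrapped value on a single line.
        self._inner_kwargs = {k: v for k, v in kwargs.items()
                              if k not in ('cls', 'indent')}
        self._registry = {}  # placeholder number -> NoIndent instance
        super().__init__(**kwargs)

    def default(self, obj):
        if isinstance(obj, NoIndent):
            key = len(self._registry)
            self._registry[key] = obj
            return self.PLACEHOLDER.format(key)
        return super().default(obj)

    def iterencode(self, obj, **kwargs):
        # Join every chunk first so all placeholders are visible at once.
        text = ''.join(super().iterencode(obj, **kwargs))

        def expand(match):
            wrapped = self._registry[int(match.group(1))]
            return json.dumps(wrapped.value, **self._inner_kwargs)

        yield self.regex.sub(expand, text)


rows = [(1, 2, 3), (2, 3, 4), (4, 5, 6)]
struct = {'data': [NoIndent(row) for row in rows]}

as_string = json.dumps(struct, cls=RegistryEncoder, sort_keys=True, indent=4)
buffer = io.StringIO()
json.dump(struct, buffer, cls=RegistryEncoder, sort_keys=True, indent=4)
print(as_string)
```

Both json.dumps() and json.dump() go through iterencode(), so as_string and buffer.getvalue() should hold the same text.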
martineau
  • Thanks. Why does dumps() work, while dumping to a file directly with dump() does not work so well? – Li Ziming Mar 10 '17 at 20:19
  • Not sure off-hand what you mean by "not so well". Will look into it when I have time. – martineau Mar 10 '17 at 20:49
  • I mean that when I use dumps() to convert the dictionary to JSON and write the result to a file, the file looks like the printed result shown above. But when I use dump() to write the dictionary to the file directly, it shows this: @@memory id@@. The dump code is: with io.open("test.json", "w") as jsonfile: json.dump(data, jsonfile, indent=4, cls=MyEncoder). – Li Ziming Mar 10 '17 at 23:20
  • @LiZiming: I see what you mean now. The problem is, for some unknown reason, the `MyEncoder.encode()` method isn't being called when `json.dump()` is called as it is when `json.dumps()` is used. No idea why—especially given that it's about an answer I wrote 4+ years ago. Since overriding `encode()` is a fairly uncommon practice, it will likely take me a while to dig into this further—however, if I find a solution I will update my answer accordingly. – martineau Mar 10 '17 at 23:38
  • @LiZiming: There you go. Let me know if you have any further issues—the new version of the code wasn't very thoroughly tested. – martineau Mar 11 '17 at 09:59
  • If I have strings in the list, they are dumped in single quotes, which cannot be read by json. Is there a workaround? – markroxor Apr 24 '19 at 12:30
  • @markroxor: Indeed, my linked answer had a similar problem, which I corrected at some point in the last seven years, but those updates were never applied to this one (which itself is a couple of years old) until just now. Thanks for pointing out the issue. – martineau Apr 24 '19 at 18:52
  • Yes I checked your linked answer later, it works perfectly. – markroxor Apr 25 '19 at 05:01
  • @markroxor: Note that I made further modifications to enhance the implementation. – martineau Apr 25 '19 at 19:14
  • where do I inject type conversion if there are unserializable objects? in the past I have used a slightly modified default encoder, as explained here: https://stackoverflow.com/a/47626762/5238559. however, I don't even understand what your `default` function does. so where would I put something like `if isinstance(obj, np.intc): return int(obj)` – mluerig May 27 '21 at 16:37
  • @mluerig: The `MyEncoder.default()` method is checking for instances of the `NoIndent` class and changing them into specially formatted strings, otherwise they're passed on to the `default()` method of its super class (aka its base class). This is being done by returning the result of a [conditional expression](https://docs.python.org/3/reference/expressions.html#conditional-expressions). To check for additional types you could modify that expression to check for them, too. Another alternative would be to derive your own subclass from `MyEncoder` instead of `json.JSONEncoder`. – martineau May 27 '21 at 17:37
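
The subclassing suggestion in the last comment could look roughly like the sketch below. It is an assumption-laden illustration: ExtraTypesEncoder is a hypothetical name, it derives from json.JSONEncoder only so that the snippet runs on its own (with the answer's code you would derive from MyEncoder instead), and standard-library types stand in for the NumPy types mentioned in the comment.

```python
import json


class ExtraTypesEncoder(json.JSONEncoder):
    """Hypothetical encoder converting a few otherwise unserializable types."""

    def default(self, obj):
        if isinstance(obj, (set, frozenset)):
            return sorted(obj)  # sets become sorted lists
        if isinstance(obj, complex):
            return [obj.real, obj.imag]  # complex numbers become [real, imag]
        # Everything else is passed on to the base class. When the base is
        # MyEncoder, this is the step where NoIndent objects get handled.
        return super().default(obj)


encoded = json.dumps({'tags': {'b', 'a'}, 'z': 1 + 2j},
                     cls=ExtraTypesEncoder, sort_keys=True)
print(encoded)  # {"tags": ["a", "b"], "z": [1.0, 2.0]}
```

A check such as `if isinstance(obj, np.intc): return int(obj)` would go in the same place as the set and complex branches, before the call to super().default().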

You just need to add it to an empty dict, like this:

data = [[1,2,3], [2,3,4], [4,5,6]]
a = {}
a.update({"data":data})
print(a)

#{'data': [[1, 2, 3], [2, 3, 4], [4, 5, 6]]}

What you are showing in the first style is just a dict. To get actual JSON from that dict, you can pass it to json.dump() to write it to a file.

For the JSON format, you just need to dump it like this:

import json
b = json.dumps(a)
print(b)
#{"data": [[1, 2, 3], [2, 3, 4], [4, 5, 6]]}

You can go to pro.jsonlint.com and check whether the JSON format is correct.
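
As a local alternative to an online validator, a round trip through json.loads() can serve as a quick validity check. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
import json

data = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
text = json.dumps({"data": data})

# json.loads() raises json.JSONDecodeError (a subclass of ValueError) on
# malformed input, so a successful round trip shows the text is valid JSON.
restored = json.loads(text)
print(restored == {"data": data})

# The single-quoted repr() of the dict, by contrast, is not valid JSON.
try:
    json.loads(str({"data": data}))
    repr_is_json = True
except ValueError as exc:
    repr_is_json = False
    print("not JSON:", exc)
```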

Shivkumar kondi