One would think doing this is easy. I swear it is, just... I can't seem to figure it out. How do I transform:

terrible_way_to_describe_nested_json=['a.b.c','a.b.d','a.e','a.f','g.h']

into

{
    "a": {
        "b": {
            "c": None,
            "d": None
        },
        "e": None,
        "f": None
    },
    "g": {
        "h": None
    }
}

If you think of 'a.b.c' as a path through a deconstructed JSON load, then I have about 200 of these paths. They are unsorted (the transformation should work regardless of order), go up to 8 dots deep, and are all eagerly hoping to become part of their original structure again. I've tried recursion, pandas to sort leaf nodes from internal ones (ridiculous?), some crazy list of lists of dictionaries of lists, even autovivification.

Field of ruin and despair

Here's one of 6 partial implementations I've written and abandoned. It gets as far as peeling back the layer of nested keys right before the fringe nodes, and then I lose my mind. I'd almost recommend ignoring it.

def dot_to_json(dotted_paths):
    scope_map=[line.split('.') for line in dotted_paths] # Split each dotted path into a list of keys
    # Sort and group list of strings according to length of list. longest group is last
    b=[]
    for index in range(max([len(x) for x in scope_map])+1):
        a=[]
        for item in scope_map:
            if len(item)==index:
                a.append(item)
        b.append(a)
    sorted_nest=[x for x in b if x] # finally group according to list length
    #Point AA
    # group string list 'prefix' with key:value
    child_path=[]
    for item in sorted_nest[-1]:
        child_path.append([item[:-1],{item[-1]:None}])
    # peel back a layer
    new_child_path=[]
    for scope in scope_map[-2]:
        value=None # set value to None if fringe node
        for index, path in enumerate(child_path):
            if path[0]==scope: # else, save key + value as a value to the new prefix key
                value=path[1]
                child_path.pop(index) # 'move' this path off child_path list
        new_child_path.append([scope[:-1],{scope[-1]:value}])
    new_child_path+=child_path
    #Point BB...
    #Loop in some intelligent way between Point AA and Point BB
    return new_child_path
#%%
dotted_json=['a.b.c','a.b.d','a.e','a.f','g.h']

scope_map=dot_to_json(dotted_json)
zelusp

2 Answers

Here you go:

In [5]: terrible_way_to_describe_nested_json=['a.b.c','a.b.d','a.e','a.f','g.h']

In [6]: terrible_way_to_describe_nested_json = [s.split('.') for s in terrible_way_to_describe_nested_json]

In [7]: data = {}

In [8]: for path in terrible_way_to_describe_nested_json:
   ....:     curr = data
   ....:     for i, node in enumerate(path):
   ....:         if i == len(path) - 1:
   ....:             curr[node] = None
   ....:         else:
   ....:             curr = curr.setdefault(node,{})
   ....:             

In [9]: data
Out[9]: {'a': {'b': {'c': None, 'd': None}, 'e': None, 'f': None}, 'g': {'h': None}}

Now pretty printing using the json module gives:

{
    "a": {
        "f": null,
        "b": {
            "d": null,
            "c": null
        },
        "e": null
    },
    "g": {
        "h": null
    }
}

The two are equivalent; only the order in which the keys are printed differs, since plain dicts do not guarantee key order (at least not on the Python 3.5 used here).
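
The same loop can be wrapped in a function and printed with a reproducible key order. This is only a minimal sketch: the name `paths_to_tree` is made up for illustration, and `sort_keys=True` is just one way to get a stable printed order, not something the answer above relies on.

```python
import json


def paths_to_tree(dotted_paths):
    """Build a nested dict from dotted paths; every leaf is set to None."""
    tree = {}
    for dotted in dotted_paths:
        # Everything before the last dot is a chain of parents.
        *parents, leaf = dotted.split('.')
        curr = tree
        for key in parents:
            # Descend, creating intermediate dicts as needed.
            curr = curr.setdefault(key, {})
        curr[leaf] = None
    return tree


paths = ['a.b.c', 'a.b.d', 'a.e', 'a.f', 'g.h']
tree = paths_to_tree(paths)

# sort_keys=True makes the printed order independent of insertion order.
pretty = json.dumps(tree, indent=4, sort_keys=True)
print(pretty)
```

Because each path is handled on its own, the input order does not affect the resulting dict as long as no path is also a prefix of another one.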

juanpa.arrivillaga
  • You should at least split the individual paths on dot (`.`) instead of iterating on their characters... – Serge Ballesta Jun 29 '16 at 09:14
  • WHOOPS forgot to add that! It was a pre-processing step. Sorry, it's late in the AM over here. – juanpa.arrivillaga Jun 29 '16 at 09:15
  • @SergeBallesta I had done that originally. I edited the answer to reflect that step. I also changed the implementation to use enumerate because it is more pythonic and it improves readability. – juanpa.arrivillaga Jun 29 '16 at 09:22
  • @juanpa.arrivillaga - will you enter a monogamous coding relationship with me? Thank you! In all seriousness, though, this worked like a charm once I changed the `None` assignment to `{}` – zelusp Jun 29 '16 at 18:44
  • @zelusp No problem. If you wanted empty dictionaries in place of None you should have said so! For this, you can delete the conditional branching inside the inner for-loop (the if-else) and replace it with `curr = curr.setdefault(node,{})` – juanpa.arrivillaga Jun 29 '16 at 20:10
  • Either would have been fine, except that setting a fringe node to `None` with your code as written threw a `TypeError: 'NoneType' object does not support item assignment`. Using `{}` "fixed" this – zelusp Jun 29 '16 at 21:09
  • @zelusp Huh? It shouldn't. That means you were trying to assign already to a fringe node. Are you sure you used `if i == len(path) - 1`? The code I used worked exactly as written, python 3.5.1 – juanpa.arrivillaga Jun 29 '16 at 21:11
  • ... which prompts me to speculate if something is getting assigned where it shouldn't be but I've reviewed the output and fail to see a glaring issue – zelusp Jun 29 '16 at 21:11
  • Yes. I've run your script verbatim. If interested I could pickle and share the data object I'm working with if you'd like a look. – zelusp Jun 29 '16 at 21:12
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/116021/discussion-between-juanpa-arrivillaga-and-zelusp). – juanpa.arrivillaga Jun 29 '16 at 21:14
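
The `TypeError` from the comments is not resolved in this thread. One input shape that produces the same message with the loop above is a path that is both a leaf and a prefix of another path, for example `'a.b'` together with `'a.b.c'`. This is a guess, not a diagnosis of zelusp's data. The sketch below, under the hypothetical name `paths_to_tree_safe`, swaps a `None` placeholder for a dict when a longer path needs it and never overwrites an existing subtree with `None`.

```python
def paths_to_tree_safe(dotted_paths):
    """Like the loop above, but tolerates a path that is a prefix of another."""
    tree = {}
    for dotted in dotted_paths:
        *parents, leaf = dotted.split('.')
        curr = tree
        for key in parents:
            if curr.get(key) is None:
                # Key is missing, or holds a None placeholder left by a
                # shorter path: turn it into a dict so we can descend.
                curr[key] = {}
            curr = curr[key]
        # Keep an existing subtree instead of overwriting it with None.
        curr.setdefault(leaf, None)
    return tree


# Same result whichever of the two overlapping paths comes first.
print(paths_to_tree_safe(['a.b', 'a.b.c']))
print(paths_to_tree_safe(['a.b.c', 'a.b']))
```

The trade-off is that a node which was listed as a leaf silently becomes an internal node; if that should be treated as an error in the input, raising an exception at that point would be the alternative.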
Try the dpath module; it makes it easy to add to, filter, and search nested dictionaries. If you replace the dots with '/' characters, you can create the dict you need.

Gábor Fekete