4

I have a defaultdict(list) and I used simplejson.dumps(my_defaultdict) to serialize it to JSON. I am using the HTML code for the dendrogram from http://bl.ocks.org/mbostock/4063570, and I am trying to get my defaultdict data into the same format as the JSON file the author uses. That file is flare.json, found at this link: http://bl.ocks.org/mbostock/raw/4063550/flare.json.

So here is my defaultdict data:

my_defaultdict = {5: ['child10'], 45: ['child92', 'child45'], 33:['child38']}

json_data = simplejson.dumps(my_defaultdict)

So my current json_data looks like this:

{
  "5": [
    "child10"
  ],
  "45": [
    "child92",
    "child45"
  ],
  "33": [
    "child38"
  ]
}

So, as I understand it, each number should become a "name" (e.g. "name": "5") and its list should become the "children". As it stands, my JSON output doesn't work with the HTML code of the dendrogram.

The expected outcome would be like this:

{
  "name": "flare",
  "children": [
    {
      "name": "5",
      "children": [
        {"name": "child10", "size": 5000}
      ]
    },
    {
      "name": "45",
      "children": [
        {"name": "child92", "size": 3501},
        {"name": "child45", "size": 3567}
      ]
    },
    {
      "name": "33",
      "children": [
        {"name": "child38", "size": 8044}
      ]
    }
  ]
}

Edit:

The answer from @martineau works, but it's not exactly what I want. I start with a defaultdict(list), and in the desired output above the "children" should be a list of dicts, whereas in martineau's kind answer the "children" is just a list of strings. If anybody can add something to make that work, it would be great. Don't worry about the "size" value; it can be ignored for now.

martineau
HR123r
  • could you post an example of the expected output format? – matthewatabet Jul 13 '15 at 19:17
  • so if you can please go to this link: http://bl.ocks.org/mbostock/raw/4063550/flare.json then you can see. But I will edit my question to include an expected outcome. Thank you for pointing this out. – HR123r Jul 13 '15 at 19:19
  • What's in my answer for "children" is a list of strings because that's what you have in `my_defaultdict` in your question. If you want it to be a list of dicts, then that is what needs to be put into `my_defaultdict` — which is why I've asked you a couple of times to show some real data (or fake data in the real data format) for it in your question. – martineau Jul 13 '15 at 22:13
  • @martineau: Hi, but the real data looks like that, the my_defaultdict. exactly as it : a defaultdict of lists : my_defaultdict = {5: ['child10'], 45: ['child92', 'child45'], 33:['child38']}. apologies if I didn't make it very clear from the beginning. This is the my data. Isn't this a defaultdict of lists? thanks. The keys are the numbers e.g. 5, 45, 33. The values are the corresponding lists of children. – HR123r Jul 14 '15 at 07:22
  • 1
    Understood...then the current version of my answer comes as close to your expected output as it can (it's been modified since you first accepted it). – martineau Jul 14 '15 at 08:58
  • @martineau: yes this works like charm, it also run on the HTML as a tree that I expect. Many thanks! – HR123r Jul 14 '15 at 19:11
  • Good to hear. For fun you might want try changing it to `[{'name': child, 'size': 3000} for child in v]` and see what happens. – martineau Jul 14 '15 at 20:00
  • @martineau: yes it works. I was wondering: what if there is a 'name': flare? How can I avoid creating another branch named 'flare' and just add that node to the flare node instead, so that no branch is named 'flare'? In my real my_defaultdict (not in the example I have given) I have a key named 'flare', and thus the tree has a branch named 'flare'. Would I need a check before building my_dict, e.g. `if k is 'flare' update to name 'flare'`? I only realised later that this was a problem, otherwise I would have mentioned it earlier, I apologise! But I would love to hear your thoughts on it, if possible – HR123r Jul 16 '15 at 06:20
  • You could change the `[{'name': k,` to `[{'name': k if k != 'flare' else 'flare_',`, If you wanted to get fancy and generate unique substitutes, you could `...else gen_name(),` and write a `gen_name()` function that returned one. – martineau Jul 16 '15 at 06:57
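
The renaming idea from the last comment could be sketched roughly like this. It is only an illustration under assumptions: gen_name() is the hypothetical helper mentioned in the comment (its body here is made up), ROOT_NAME and the 'flare_N' naming scheme are assumptions, and the sample data with a 'flare' key is invented for the example.

```python
import itertools
import json
from collections import defaultdict

ROOT_NAME = 'flare'               # name of the root node (assumption)
_suffixes = itertools.count(1)    # source of numeric suffixes


def gen_name(taken):
    """Hypothetical helper: return a 'flare_N' name that is not in `taken`."""
    while True:
        candidate = '{}_{}'.format(ROOT_NAME, next(_suffixes))
        if candidate not in taken:
            return candidate


# Invented sample data that contains a key clashing with the root name.
my_defaultdict = defaultdict(list, {5: ['child10'], 'flare': ['child7']})

# Names already in use, so a generated substitute cannot collide with them.
taken = {str(k) for k in my_defaultdict}

my_dict = {'name': ROOT_NAME,
           'children': [{'name': str(k) if k != ROOT_NAME else gen_name(taken),
                         'children': [{'name': child} for child in v]}
                        for k, v in my_defaultdict.items()]}

print(json.dumps(my_dict, indent=2))
```

With this sample data the clashing branch ends up named 'flare_1' while the root keeps the name 'flare'.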

3 Answers

6

You need to make a new dictionary from your defaultdict. The children in your example code are just a list of strings, and I don't know where the "size" of each one comes from, so I just changed them into a list of dicts (which don't have an entry for a "size" key).

from collections import defaultdict
#import simplejson as json
import json  # using stdlib module instead

my_defaultdict = defaultdict(list, { 5: ['child10'],
                                    45: ['child92', 'child45'],
                                    33: ['child38']})

my_dict = {'name': 'flare',
           'children': [{'name': k,
                         'children': [{'name': child} for child in v]}
                            for k, v in my_defaultdict.items()]}

json_data = json.dumps(my_dict, indent=2)

print(json_data)

Output:

{
  "name": "flare",
  "children": [
    {
      "name": 33,
      "children": [
        {
          "name": "child38"
        }
      ]
    },
    {
      "name": 5,
      "children": [
        {
          "name": "child10"
        }
      ]
    },
    {
      "name": 45,
      "children": [
        {
          "name": "child92"
        },
        {
          "name": "child45"
        }
      ]
    }
  ]
}
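
If the names should be strings, as in the question's expected outcome, and every leaf should carry a "size", a minimal sketch could look like the following. The function name to_flare and its parameters root_name and default_size are hypothetical, and the value 3000 is only a placeholder because the source data contains no sizes.

```python
import json
from collections import defaultdict


def to_flare(groups, root_name='flare', default_size=None):
    """Build a flare-style tree from a mapping of key -> list of child names.

    Hypothetical helper: keys are converted to strings, and a 'size' entry is
    added to each leaf only when default_size is given.
    """
    children = []
    for key, names in groups.items():
        leaves = []
        for name in names:
            leaf = {'name': name}
            if default_size is not None:
                leaf['size'] = default_size  # placeholder, not real data
            leaves.append(leaf)
        children.append({'name': str(key), 'children': leaves})
    return {'name': root_name, 'children': children}


my_defaultdict = defaultdict(list, {5: ['child10'],
                                    45: ['child92', 'child45'],
                                    33: ['child38']})

tree = to_flare(my_defaultdict, default_size=3000)
print(json.dumps(tree, indent=2))
```

Calling to_flare(my_defaultdict) without default_size gives the same layout as the output above, except that the branch names are strings.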
martineau
  • 1
    I just hardcoded it to have some sample input that was a `defaultdict` and I based it on what you have in your question (which doesn't have a "size" for the children in it). You need to provide us with an example of data from one of your real defaultdict objects. – martineau Jul 13 '15 at 20:03
  • Hey! It works! Thank you! This is great. I realised the purpose of your line my_defaultdict = defaultdict(list, (("5", ["child10"]), ("45", ["child92", "child45"]), ("33", ["child38"]))) as well. I want to ask a quick question: with json_data = json.dumps(my_dict, indent=2*' ') I get a TypeError: can't multiply sequence by non-int of type 'str'. Could you please explain what you wrote? It looks to me like an * (asterisk) and quotation marks with a space in between. To solve it I just left it as indent = 2, but I am very interested to know what you wrote there. Many thanks again! – HR123r Jul 13 '15 at 20:03
  • 1
    The `indent=2*' '` means 2 space characters (and it works fine for me and is standard Python). I did this because that's what `simplejson` wants. If you're using the built-in `json` module, it assumes space characters and just wants an integer value. – martineau Jul 13 '15 at 20:06
  • When you say the 'size' for the children, as with the http://bl.ocks.org/mbostock/raw/4063550/flare.json file, does the size correspond to the circle size in the dendogram? – HR123r Jul 13 '15 at 20:09
  • 1
    No, I mentioned it only because you show one in the "expected outcome" portion at the end of your question. – martineau Jul 13 '15 at 20:12
  • you are right. as I want to reproduce the dendogram from that HTML code, it would be good to include the 'size' variable. I presume that the size is the size of the circle in the dendogram? Please correct me if I am wrong. Again many thanks for all your help and time spent. I learned a lot! of stuff :) – HR123r Jul 13 '15 at 20:15
  • 1
    Seems like a reasonable assumption. but frankly I have no idea. – martineau Jul 13 '15 at 20:17
  • Can I ask you one last thing? Did you try running the HTML code with this JSON output at all? Did you get a dendrogram? I am trying to open it in both Chrome and Internet Explorer but I don't get anything. I will try it with Firefox. The JSON output you gave me seems to be the appropriate one, so after amending the HTML code I would expect it to produce the dendrogram. – HR123r Jul 13 '15 at 20:25
  • No, I didn't try running the HTML code with this JSON output. If you provide the data from a real defaultdict you have, I might be able to update my answer with something that will work for you. – martineau Jul 13 '15 at 20:38
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/83153/discussion-between-hr123r-and-martineau). – HR123r Jul 13 '15 at 20:38
2

I solved it by using this: How to convert defaultdict to dict?

For future readers who may search for this: I did it by converting the defaultdict into a common dictionary, just by calling:

from collections import defaultdict

b = defaultdict(dict)

a = dict(b)

Then the JSON serializer could recognize this structure.
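
A small sketch of that conversion, using the standard-library json module and sample data from the question (the variable name encoded is made up for the example). Note that the conversion changes only the type, not the nesting, so on its own it does not produce the "name"/"children" layout asked for in the question; integer keys also come out as strings in the JSON text.

```python
import json
from collections import defaultdict

b = defaultdict(list)
b[5].append('child10')   # sample data taken from the question

a = dict(b)              # plain dict with the same keys and values

encoded = json.dumps(a)  # the integer key 5 is written as the string "5"
print(encoded)
```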

1

You need to build the dictionary so that it contains the desired 'children' fields. json.dumps does not output data in any predefined schema. Rather, the object passed to json.dumps must already adhere to any structure desired.

Try something like this:

import json

my_defaultdict = {"name": "5",
                  "children": [{"name": "child10", "children": []}]}
print(json.dumps(my_defaultdict))
matthewatabet
  • Thank you for your answer, but the issue is that I can't change the way my_defaultdict is created, because it is created by other functions in my code and the data I am handling is huge and I can't go and change it manually. Plus the my_defaultdict will be built on the fly every time the code runs and thus the data in the my_defaultdict cannot be hardcoded. Everytime it will be different. :( – HR123r Jul 13 '15 at 19:29
  • why can't you change it manually? it's possible to write such logic. – matthewatabet Jul 13 '15 at 19:30
  • @mattewatabet: my_default dict is populated by the output of other function on the script. It is created on the fly, it is very big and I want an automation to the process as each time the script would run it would be different. Are you suggesting that I would have to find which elements in the my_defaultdict are the children to which parent and build a second_defaultdict with the structure you are suggesting? But then how do I go doing that? – HR123r Jul 13 '15 at 19:39