0

Having trouble with this one. Feel as though I've whipped through similar transformations of data, but this one is throwing me for a loop.

Looking to convert a spreadsheet data into a nested JSON that will be used for a D3 visualization, seen it referred to as "flare.json".

The target JSON would look something like this (where, obviously, a dictionary would be just as good):

{
  "name": "root",
  "children": [
    { 
      "name": "A1",
      "children": [
        { 
          "name": "A2",
          "children": [
            { 
              "name": "A3",
              "children": [
                { 
                  "name": "A4",
                  "children": [
                  ] 
                }
              ]
            }
          ] 
        }
      ]
    },
    { 
      "name": "B1",
      "children": [
        { 
          "name": "B2",
          "children": [
            { 
              "name": "B3",
              "children": [
                { 
                  "name": "B4",
                  "children": [
                  ] 
                }
              ]
            }
          ] 
        }
      ]
    }  
  ]
}

I'm pulling data from a spreadsheet with openpyxl that provides a root tuple that contains tuples of each column value.

e.g.

(
  ('A1','A2','A3','A4'),
  ('B1','B2','B3','B4'),
)

I know there are 101 different ways to do this, considered using dataframes from pandas, and I'm sure openpyxl has a myriad of methods and conversions for this kind of thing. But for whatever reason, having a hard time envisioning this process today. Thanks in advance.

ghukill
  • 1,136
  • 17
  • 42
  • Possible duplicate of [Convert pandas DataFrame to a nested dict](http://stackoverflow.com/questions/19798112/convert-pandas-dataframe-to-a-nested-dict) – Ami Tavory Aug 03 '16 at 20:20

1 Answers1

1

This should do it:

def convert_data(data):                                     
    out = {'name': 'root', 'children': []}           

    for row in data:                                 
        out['children'].append({})                   
        current = out['children']                    
        for value in row:                            
            current[-1]['name'] = value              
            current[-1]['children'] = [{}]           

            current = current[-1]['children']        

    return out                                       

data = (('A1','A2','A3','A4'), ('B1','B2','B3','B4'))

new_structure = convert_data(data)

You can obviously use something like json.dumps to output it as a JSON string.

OdinX
  • 4,135
  • 1
  • 24
  • 33
  • THANK YOU. This is awesome, and works perfectly. For my own sanity, it's nice to know I was close, but clearly not quite there. Much appreciated. – ghukill Aug 03 '16 at 20:57