
I have a file data.csv, shown below, from which I intend to create JSON for a D3.js sunburst visualization:

REGION_CODE,LOCATION_CODE,SKU_CODE,BASIC_SALES
Region 0,Location 10,SKU 500118,25
Region 0,Location 10,SKU 500122,34
Region 0,Location 11,SKU 500123,34
Region 0,Location 11,SKU 500124,68

I'm trying to convert it into nested JSON as follows:

{
    'name': 'region 0',
    'shortName': 'region 0',
    'children': [
        {
            'name': 'location 10',
            'shortName': 'location 10',
            'children':[
                {
                    'name': 'SKU 500118',
                    'shortName': 'SKU 500118',
                    'size': '25'
                }, 
                {
                    'name': 'SKU 500122',
                    'shortName': 'SKU 500122',
                    'size': '34'
                }
            ]
        },
        {
            'name': 'location 11',
            'shortName': 'location 11',
            'children': [
                {
                    'name': 'SKU 500123',
                    'shortName': 'SKU 500123',
                    'size': '34'
                },
                {
                    'name': 'SKU 500124',
                    'shortName': 'SKU 500124',
                    'size': '68'
                }
            ]
        }
    ]
}

I found an almost identical solution on Stack Overflow, convert-csv-to-json-tree-structure, but it goes all the way down to the last column and adds it as children, whereas I want the second-to-last column to be added as children and the last column to be added as size, as shown above.

Saleem Ali
Irfan Harun

1 Answer


I looked at the similar example and modified it according to your requirements.

The script assumes that there is only one BASIC_SALES number for any tuple (REGION_CODE, LOCATION_CODE, SKU_CODE).
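If that assumption does not hold, the script as written lets a later row overwrite an earlier one for the same tuple. One possible workaround is to sum the values per tuple before building the tree. The following is only a minimal sketch: the helper name `aggregate_rows` is hypothetical, the duplicate row in the sample is made up for illustration, and it assumes BASIC_SALES always holds integers.

```python
import csv
import io
from collections import defaultdict


def aggregate_rows(rows):
    """Sum the last column over rows that share the same leading columns."""
    totals = defaultdict(int)
    for row in rows:
        *path, value = row
        totals[tuple(path)] += int(value)  # assumption: values are integers
    # dicts keep insertion order, so the first-seen order of tuples is preserved
    return [list(path) + [total] for path, total in totals.items()]


# Sample input; the second data row is a made-up duplicate for illustration.
sample = io.StringIO(
    "REGION_CODE,LOCATION_CODE,SKU_CODE,BASIC_SALES\n"
    "Region 0,Location 10,SKU 500118,25\n"
    "Region 0,Location 10,SKU 500118,5\n"
    "Region 0,Location 11,SKU 500123,34\n"
)
reader = csv.reader(sample)
next(reader)  # skip header row
merged = aggregate_rows(reader)
print(merged)
```

The merged rows can then be passed to add_leaf one by one in place of the raw CSV rows. Note that the sizes become integers here, whereas the script keeps them as strings.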

import csv
import json

# helper to create dict of dict of .. of values
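def add_leaf(tree, row):
    key = row[0]
    if len(row) > 2:
        # more than (name, value) left: descend one level
        if key not in tree:
            tree[key] = {}
        add_leaf(tree[key], row[1:])
    elif len(row) == 2:
        # second-to-last column is the leaf name, last column its value
        tree[key] = row[-1]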

# transforms helper structure to final structure
def transform_tree(tree):
    res = []
    for key, val in tree.items():
        if isinstance(val, dict):
            res.append({
                "name": key,
                "shortName": key,
                "children": transform_tree(val),
                })
        else:
            res.append({
                "name": key,
                "shortName": key,
                "size": val,
                })
    return res


def main():
    """Run the conversion, which is composed of two parts.

    First, it parses the CSV file and builds a tree hierarchy from it,
    using the recursive function add_leaf.

    Second, it recursively transforms this tree structure into the
    desired hierarchy.

    Finally, it prints the result.

    """

    # Part 1: create a hierarchical dict of dicts of dicts.
    # The leaf cells, however, are just the last element of each row.
    tree = {}
    with open('data.csv', newline='') as csvfile:  # file name taken from the question
        reader = csv.reader(csvfile)
        for rid, row in enumerate(reader):
            if rid == 0:  # skip header row
                continue
            add_leaf(tree, row)

    # uncomment to visualize the intermediate result
    # print(json.dumps(tree, indent=1))
    # print("=" * 76)

    # Part 2: transform the tree into the desired structure
    res = transform_tree(tree)
    print(json.dumps(res, indent=1))

# so let's roll
main()
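
To try the approach without a file on disk, the same two steps can be run against the sample data from the question held in a string. This is a condensed sketch of the script above, not a replacement for it: the wrapper name `csv_text_to_tree` and the `SAMPLE` constant are introduced here for illustration only.

```python
import csv
import io
import json

# Sample data copied from the question.
SAMPLE = """REGION_CODE,LOCATION_CODE,SKU_CODE,BASIC_SALES
Region 0,Location 10,SKU 500118,25
Region 0,Location 10,SKU 500122,34
Region 0,Location 11,SKU 500123,34
Region 0,Location 11,SKU 500124,68
"""


def add_leaf(tree, row):
    """Insert one CSV row into a nested dict; the last column is the leaf value."""
    key = row[0]
    if len(row) > 2:
        add_leaf(tree.setdefault(key, {}), row[1:])
    elif len(row) == 2:
        tree[key] = row[1]


def transform_tree(tree):
    """Turn the nested dict into a list of name/shortName/children|size nodes."""
    nodes = []
    for key, val in tree.items():
        node = {"name": key, "shortName": key}
        if isinstance(val, dict):
            node["children"] = transform_tree(val)
        else:
            node["size"] = val
        nodes.append(node)
    return nodes


def csv_text_to_tree(text):
    """Hypothetical wrapper: parse CSV text and return the list of top-level nodes."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip header row
    tree = {}
    for row in reader:
        add_leaf(tree, row)
    return transform_tree(tree)


result = csv_text_to_tree(SAMPLE)
print(json.dumps(result, indent=1))
```

The result is a list with one entry per region, so for a single-region file the object the question asks for is `result[0]`. Names keep the capitalization found in the CSV, and sizes remain strings, as in the script above.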
gelonida