-1

How do I convert below comma delimited records -

COL1,COL2,COL3,COL4
A101,P501,U901,US_A
A101,P501,U902,US_B
A101,P502,U901,US_A
A102,P501,U901,US_A
A102,P502,U902,US_B

into python dictionary -

result = {
"A101": {
    "P501": {"U901": "US_A", "U902": "US_B"},
    "P502": {"U901": "US_A"}
},
"A102": {
    "P501": {"U901": "US_A"},
    "P502": {"U902": "US_B"}
}

}

Thank you for your help!

ASD
  • 69
  • 1
  • 6

2 Answers2

2

Approach

We can process the rows of the CSV file as follows:

Code

import csv

def csv_to_nested_dict(filenm):
    ' CSV file to nested dictionary '
    
    with open(filenm, 'r') as csvfile:
        csv_reader = csv.reader(csvfile, delimiter=',')
        next(csv_reader)                         # skip header row

        result = {}
        for row in csv_reader:
            # Convert row to nested dictionary and
            # Merge into result
            result = merge_dict(result, 
                                reduce(lambda x, y: {y: x}, reversed(row))) # row to nested dictionary
            
    return result

def merge_dict(dict1, dict2):
    ' Merges nested dictionaries '
    for key, val in dict1.items():
        if type(val) == dict:
            if key in dict2 and type(dict2[key] == dict):
                merge_dict(dict1[key], dict2[key])
        else:
            if key in dict2:
                dict1[key] = dict2[key]

    for key, val in dict2.items():
        if not key in dict1:
            dict1[key] = val

    return dict1

Test

Usage:

res = csv_to_nested_dict('test.txt')  # result

# Use json to pretty print nested dictionary res
import json
print(json.dumps(res, indent = 4))    

Input File test.txt

COL1,COL2,COL3,COL4
A101,P501,U901,US_A
A101,P501,U902,US_B
A101,P502,U901,US_A
A102,P501,U901,US_A
A102,P502,U902,US_B

Output

{
    "A101": {
        "P501": {
            "U901": "US_A",
            "U902": "US_B"
        },
        "P502": {
            "U901": "US_A"
        }
    },
    "A102": {
        "P501": {
            "U901": "US_A"
        },
        "P502": {
            "U902": "US_B"
        }
    }
}
DarrylG
  • 16,732
  • 2
  • 17
  • 23
0

here is simple version of solution -

def dict_reader(file_name):
with open(file_name, 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    try:
        data = dict()
        for row in reader:
            col1, col2, col3, col4 = (row["col1"], row["col2"], row["col3"], row["col4"])
            if col1 in data:
                if col2 in data[col1]:
                    data[col1][col2].update({col3: col4})
                else:
                    data[col1][col2] = {col3: col4}
            else:
                data[col1] = {col2: {col3: col4}}
    except csv.Error as e:
        sys.exit('file {}, line {}: {}'.format(file_name, reader.line_num, e))
    finally:
        return data

it is not very elegant solution but works.

ASD
  • 69
  • 1
  • 6