in python, read csv data and. convert unique values into python's dictionary

Question

How do I convert below comma delimited records -

COL1,COL2,COL3,COL4
A101,P501,U901,US_A
A101,P501,U902,US_B
A101,P502,U901,US_A
A102,P501,U901,US_A
A102,P502,U902,US_B

into python dictionary -

result = {
"A101": {
    "P501": {"U901": "US_A", "U902": "US_B"},
    "P502": {"U901": "US_A"}
},
"A102": {
    "P501": {"U901": "US_A"},
    "P502": {"U902": "US_B"}
}

}

Thank you for your help!

DarrylG · Answer 1 · 2022-09-24T14:36:29.273

Approach

We can process the rows of the CSV file as follows:

Convert each row in CSV file from a list to a nested dictionary using Convert a list to nested dictionary i.e. line reduce(lambda x, y: {y: x}, reversed(row))) in code below.
Merge the nested dictionaries using Merge nested dictionaries in Python using merge_dict function below

Code

import csv

def csv_to_nested_dict(filenm):
    ' CSV file to nested dictionary '
    
    with open(filenm, 'r') as csvfile:
        csv_reader = csv.reader(csvfile, delimiter=',')
        next(csv_reader)                         # skip header row

        result = {}
        for row in csv_reader:
            # Convert row to nested dictionary and
            # Merge into result
            result = merge_dict(result, 
                                reduce(lambda x, y: {y: x}, reversed(row))) # row to nested dictionary
            
    return result

def merge_dict(dict1, dict2):
    ' Merges nested dictionaries '
    for key, val in dict1.items():
        if type(val) == dict:
            if key in dict2 and type(dict2[key] == dict):
                merge_dict(dict1[key], dict2[key])
        else:
            if key in dict2:
                dict1[key] = dict2[key]

    for key, val in dict2.items():
        if not key in dict1:
            dict1[key] = val

    return dict1

Test

Usage:

res = csv_to_nested_dict('test.txt')  # result

# Use json to pretty print nested dictionary res
import json
print(json.dumps(res, indent = 4))

Input File test.txt

COL1,COL2,COL3,COL4
A101,P501,U901,US_A
A101,P501,U902,US_B
A101,P502,U901,US_A
A102,P501,U901,US_A
A102,P502,U902,US_B

Output

{
    "A101": {
        "P501": {
            "U901": "US_A",
            "U902": "US_B"
        },
        "P502": {
            "U901": "US_A"
        }
    },
    "A102": {
        "P501": {
            "U901": "US_A"
        },
        "P502": {
            "U902": "US_B"
        }
    }
}

score 0 · Answer 2 · answered Sep 25 '22 at 21:30

here is simple version of solution -

def dict_reader(file_name):
with open(file_name, 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    try:
        data = dict()
        for row in reader:
            col1, col2, col3, col4 = (row["col1"], row["col2"], row["col3"], row["col4"])
            if col1 in data:
                if col2 in data[col1]:
                    data[col1][col2].update({col3: col4})
                else:
                    data[col1][col2] = {col3: col4}
            else:
                data[col1] = {col2: {col3: col4}}
    except csv.Error as e:
        sys.exit('file {}, line {}: {}'.format(file_name, reader.line_num, e))
    finally:
        return data

it is not very elegant solution but works.

in python, read csv data and. convert unique values into python's dictionary

2 Answers2