-2

I've had a look around but no other thread seems to quite answer the specific challenge I'm facing.

For example, this thread tells me how to write a list of dictionaries to CSV.

This one explains how to write a dictionary when the value for each key is a list.

I have a list of dictionaries that I need to write to CSV where only one of the values is a list. For example:

[{
    'name': 'name_1', 
    'id': 'id_1', 
    'info': [{
        'info_1': 'some info',
        'info_2': 'more info'
        },
        {
        'info_1': 'all the info',
        'info_2': 'extra info'
    }]
 },
 {    
    'name': 'name_2', 
    'id': 'id_2', 
    'info': [{
        'info_1': 'another piece of info the same type as info_1 above',
        'info_2': 'info'
        },
        {
        'info_1': 'getting tedious',
        'info_2': 'you get the picture...'
        }
    ]
}]

Output headers are: name, id, info_1, info_2

Nimantha
  • 6,405
  • 6
  • 28
  • 69

2 Answers2

1

This type of list of the dictionary is called Nested JSON; it is better to handle this data type to CSV with Pandas method json_normalize :

import pandas as pd

test = [{
    'name': 'name_1', 
    'id': 'id_1', 
    'info': [{
        'info_1': 'some info',
        'info_2': 'more info'
        },
        {
        'info_1': 'all the info',
        'info_2': 'extra info'
    }]
 },
 {    
    'name': 'name_2', 
    'id': 'id_2', 
    'info': [{
        'info_1': 'another piece of info the same type as info_1 above',
        'info_2': 'info'
        },
        {
        'info_1': 'getting tedious',
        'info_2': 'you get the picture...'
        }
    ]
}]

df = pd.json_normalize(test, 'info', ['id', 'name'], 
                    record_prefix='information_')

df.to_csv('information.csv')

final result(screenshot): Office CSV reader

*: it is better to use Pandas for this problem instead of CSV standard library because if you have a large file, 2 iteration to handle this problem has much time processing.

Bahram Jannesar
  • 109
  • 1
  • 7
  • Ahh, this is perfect! I knew about the pandas json_normalize function but didn't know how to handle the additional nested values. Thanks. – cutting_shapes Jun 21 '21 at 11:38
  • check this course, something interesting about nested JSON, [Data Camp Nested JSON](https://campus.datacamp.com/courses/streamlined-data-ingestion-with-pandas/importing-json-data-and-working-with-apis?ex=1) – Bahram Jannesar Jun 21 '21 at 11:45
0

To create a CSV file from the provided list, you can use this example:

import csv

lst = [
    {
        "name": "name_1",
        "id": "id_1",
        "info": [
            {"info_1": "some info", "info_2": "more info"},
            {"info_1": "all the info", "info_2": "extra info"},
        ],
    },
    {
        "name": "name_2",
        "id": "id_2",
        "info": [
            {
                "info_1": "another piece of info the same type as info_1 above",
                "info_2": "info",
            },
            {"info_1": "getting tedious", "info_2": "you get the picture..."},
        ],
    },
]
with open("data.csv", "w") as f_out:
    writer = csv.writer(f_out)
    writer.writerow(["name", "id", "info_1", "info_2"])
    for d in lst:
        for i in d["info"]:
            writer.writerow([d["name"], d["id"], i["info_1"], i["info_2"]])

Creates data.csv (screenshot from LibreOffice):

enter image description here

Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91