Python: should I save a 4-level nested structure (a list of dictionaries containing lists of dictionaries) to one CSV file or several?

Question

I have the structure below. What would be the best way to store all of it: in one CSV file or in several? That is, should I try to put everything in a single file, or should I split it across several files? Any suggestions on how to do it would also be welcome.

I am learning to work with CSV files. So far I have worked with containers that have 2 levels: I know how to write a list of dictionaries to CSV, and likewise a dictionary of lists, a list of lists, and a dictionary of dictionaries.

But this thing has 4 levels.

It is a list of dictionaries, each of which contains a list of dictionaries.

data = [
    {
        "name": None,
        "age": None,
        "city": None,
        "score": 0,
        "attempts": 0,
        "collection": [
            {
                'title': None,
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            },
            {
                'title': None,
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            },
            {
                'title': None,
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }
        ]
    },
    {
        "name": None,
        "age": None,
        "city": None,
        "score": 0,
        "attempts": 0,
        "collection": [
            {
                'title': None,
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            },
            {
                'title': None,
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            },
            {
                'title': None,
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }
        ]
    }
]

What is the intended use of the file? To be read back into a Python project? To be used somewhere else? Does it need to be CSV? This looks like a good use case for JSON. If you are reading it back into Python, a pickle might be the best approach. - user1558604, Nov 29 '19 at 23:50
To be read back into a Python project. Yes, I was thinking that JSON might be the better choice. Do you think JSON can handle these 4 levels better than the csv module? - tucooperacion, Nov 29 '19 at 23:55
Yes, CSV is good for tabular data. The data you have is not particularly tabular. JSON doesn't have that limitation. - user1558604, Nov 30 '19 at 00:03

score 3 · Answer 1 · answered Nov 30 '19 at 00:29

There are a couple of obvious options:

- Store the list of dicts in a JSON file, as shown in "Python: converting a list of dictionaries to json".
- Work with the data in pandas and then save it to a single CSV file.
  - Once the data is in a flat table, most kinds of analysis and visualization become much easier.
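For the JSON option, a minimal sketch using only the standard library could look like the following. The file name `players.json` and the shortened sample record are assumptions made for illustration; the nesting matches the structure in the question.

```python
import json
import os
import tempfile

# Shortened sample record (illustrative values, same nesting as the question).
data = [
    {
        "name": "A", "age": 30, "city": "B", "score": 10, "attempts": 10,
        "collection": [
            {"title": "X", "artist": None, "genre": None, "year": None, "guessed": 0},
            {"title": "Y", "artist": None, "genre": None, "year": None, "guessed": 0},
        ],
    },
]

# Hypothetical file location; any writable path works.
path = os.path.join(tempfile.gettempdir(), "players.json")

# Write the whole nested structure in one call. None is stored as JSON null.
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)

# Read it back; the nesting and the None values are restored as they were.
with open(path, encoding="utf-8") as f:
    restored = json.load(f)
```

Because JSON has lists, objects, and null built in, the structure survives the round trip without any flattening, which is why it tends to suit data that will only be read back into Python.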

Given the following data:

data = [{
        "name": "A",
        "age": 30,
        "city": "B",
        "score": 10,
        "attempts": 10,
        "collection": [{
                'title': "X",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }, {
                'title': "Y",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }, {
                'title': "Z",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }
        ]
    }, {
        "name": "C",
        "age": 40,
        "city": "D",
        "score": 20,
        "attempts": 30,
        "collection": [{
                'title': "L",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }, {
                'title': "M",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }, {
                'title': "N",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }, {
                'title': "O",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }
        ]
    }
]

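Load into pandas with `json_normalize`:

import pandas as pd

# Since pandas 1.0, json_normalize is available as pd.json_normalize;
# importing it from pandas.io.json is deprecated.
# This produces one row per item in 'collection', with the parent fields repeated as columns.
df = pd.json_normalize(data, 'collection', ['name', 'age', 'city', 'score', 'attempts'])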

# df view
  title artist genre  year  guessed name age city score attempts
0     X   None  None  None        0    A  30    B    10       10
1     Y   None  None  None        0    A  30    B    10       10
2     Z   None  None  None        0    A  30    B    10       10
3     L   None  None  None        0    C  40    D    20       30
4     M   None  None  None        0    C  40    D    20       30
5     N   None  None  None        0    C  40    D    20       30
6     O   None  None  None        0    C  40    D    20       30

# save to csv
df.to_csv('my_file.csv', index=False)

# reload from csv
df = pd.read_csv('my_file.csv')
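If the nested list is needed again after reloading, the flat table has to be regrouped. The sketch below is one possible way to do that, not part of the original answer: the helper `to_plain` and the inline sample rows are hypothetical, and it assumes that the five player fields together identify a player (two players with identical values in all five would be merged) and that none of those fields is empty, since `groupby` drops rows with missing keys by default.

```python
import io

import pandas as pd

# A small stand-in for the flat CSV file (hypothetical sample rows).
csv_text = """title,artist,genre,year,guessed,name,age,city,score,attempts
X,,,,0,A,30,B,10,10
Y,,,,0,A,30,B,10,10
L,,,,0,C,40,D,20,30
"""
df = pd.read_csv(io.StringIO(csv_text))

player_cols = ["name", "age", "city", "score", "attempts"]
item_cols = ["title", "artist", "genre", "year", "guessed"]


def to_plain(value):
    """Turn NaN into None and NumPy scalars into built-in Python values."""
    if pd.isna(value):
        return None
    if hasattr(value, "item"):
        return value.item()
    return value


nested = []
# sort=False keeps the players in the order they first appear in the file.
for keys, group in df.groupby(player_cols, sort=False):
    player = {col: to_plain(val) for col, val in zip(player_cols, keys)}
    player["collection"] = [
        {col: to_plain(row[col]) for col in item_cols}
        for _, row in group.iterrows()
    ]
    nested.append(player)
```

Note that empty cells come back from `read_csv` as NaN rather than None, which is why the conversion step is needed. This extra work is the main cost of the single-CSV route compared with JSON.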
