
I have the structure below. What would be the best way to save all of it to CSV: should I try to put everything in one single file, or split it across several files? Any suggestions on how to do it would also be welcome.

I am learning to work with csv files. So far I have worked with containers that have 2 levels: I know how to write a list of dictionaries, a dictionary of lists, a list of lists, and a dictionary of dictionaries to csv.

But this thing has 4 levels.

It is a list of dictionaries, and each dictionary in turn contains a list of dictionaries.

data= [
    {
        "name":None,
        "age":None,
        "city":None,
        "score":0,
        "attempts":0,
        "collection":[
            {
                'title':None,
                'artist':None,
                'genre':None,
                'year':None,
                'guessed':0
            },
            {
                'title':None,
                'artist':None,
                'genre':None,
                'year':None,
                'guessed':0
            },
            {
                'title':None,
                'artist':None,
                'genre':None,
                'year':None,
                'guessed':0
            }
        ]
     },
    {
        "name": None,
        "age": None,
        "city": None,
        "score": 0,
        "attempts": 0,
        "collection": [
            {
                'title': None,
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            },
            {
                'title': None,
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            },
            {
                'title': None,
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }
        ]
    }
]
Trenton McKinney
  • What is the desired use of the file? To be read back into a Python project? To be used somewhere else? Does it need to be CSV? This looks like a good use for JSON. If you are reading it back into Python, a pickle might be the best approach. – user1558604 Nov 29 '19 at 23:50
  • To be read back into a Python project. Yes, I was thinking that maybe JSON would be best? Do you think JSON can handle these 4 levels better than the csv module? – tucooperacion Nov 29 '19 at 23:55
  • user1558604: Thanks, I will try JSON – tucooperacion Nov 30 '19 at 00:01
  • Yes, CSV is good for tabular data. The data you have is not particularly tabular. JSON doesn't have that limitation. – user1558604 Nov 30 '19 at 00:03
  • user1558604, Thanks a lot indeed! – tucooperacion Nov 30 '19 at 00:08

1 Answer

There are a couple of obvious options.

Given the following data:

data = [{
        "name": "A",
        "age": 30,
        "city": "B",
        "score": 10,
        "attempts": 10,
        "collection": [{
                'title': "X",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }, {
                'title': "Y",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }, {
                'title': "Z",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }
        ]
    }, {
        "name": "C",
        "age": 40,
        "city": "D",
        "score": 20,
        "attempts": 30,
        "collection": [{
                'title': "L",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }, {
                'title': "M",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }, {
                'title': "N",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }, {
                'title': "O",
                'artist': None,
                'genre': None,
                'year': None,
                'guessed': 0
            }
        ]
    }
]
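
Store as JSON

As suggested in the comments, JSON is one option when the file only needs to be read back into Python, because it keeps the nesting as it is. This is only a minimal sketch: `sample` is a shortened copy of the data above, and the file name `data.json` and the temporary folder are assumptions made for illustration.

```python
import json
import os
import tempfile

# Shortened sample with the same shape as `data` (values are illustrative).
sample = [
    {
        "name": "A", "age": 30, "city": "B", "score": 10, "attempts": 10,
        "collection": [
            {"title": "X", "artist": None, "genre": None, "year": None, "guessed": 0},
            {"title": "Y", "artist": None, "genre": None, "year": None, "guessed": 0},
        ],
    },
]

# Hypothetical location: a temporary folder keeps the sketch free of side effects.
path = os.path.join(tempfile.mkdtemp(), "data.json")

# Write the nested structure unchanged; None is stored as JSON null.
with open(path, "w", encoding="utf-8") as f:
    json.dump(sample, f, indent=2)

# Read it back; null becomes None again and numbers stay numbers.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)
```

With this kind of data (strings, numbers, None, lists and dictionaries with string keys), `loaded` should compare equal to `sample`, so no flattening or rebuilding step is needed.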

Load into pandas with json_normalize

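import pandas as pd

# pd.json_normalize is available from pandas 1.0; the older
# `from pandas.io.json import json_normalize` import is deprecated.
# Produces one row per item in 'collection', with the parent fields repeated on each row.
df = pd.json_normalize(data, 'collection', ['name', 'age', 'city', 'score', 'attempts'])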

# df view
  title artist genre  year  guessed name age city score attempts
0     X   None  None  None        0    A  30    B    10       10
1     Y   None  None  None        0    A  30    B    10       10
2     Z   None  None  None        0    A  30    B    10       10
3     L   None  None  None        0    C  40    D    20       30
4     M   None  None  None        0    C  40    D    20       30
5     N   None  None  None        0    C  40    D    20       30
6     O   None  None  None        0    C  40    D    20       30

# save to csv
df.to_csv('my_file.csv', index=False)

# reload from csv
df = pd.read_csv('my_file.csv')
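
Split into two csv files with the csv module

The "several files" idea from the question can also be sketched without pandas: one file for the people and one for the collection items, linked by an id. This is only a sketch under assumptions: the `person_id` column, the file names `people.csv` and `collection.csv`, and the shortened `sample` are all made up for illustration.

```python
import csv
import os
import tempfile

# Shortened sample with the same shape as `data` (values are illustrative).
sample = [
    {"name": "A", "age": 30, "city": "B", "score": 10, "attempts": 10,
     "collection": [
         {"title": "X", "artist": None, "genre": None, "year": None, "guessed": 0},
         {"title": "Y", "artist": None, "genre": None, "year": None, "guessed": 0}]},
    {"name": "C", "age": 40, "city": "D", "score": 20, "attempts": 30,
     "collection": [
         {"title": "L", "artist": None, "genre": None, "year": None, "guessed": 0}]},
]

# `person_id` is a hypothetical link column; it is not part of the original data.
person_fields = ["person_id", "name", "age", "city", "score", "attempts"]
item_fields = ["person_id", "title", "artist", "genre", "year", "guessed"]

folder = tempfile.mkdtemp()  # temporary folder, only for this sketch
people_path = os.path.join(folder, "people.csv")
items_path = os.path.join(folder, "collection.csv")

with open(people_path, "w", newline="", encoding="utf-8") as pf, \
        open(items_path, "w", newline="", encoding="utf-8") as cf:
    people = csv.DictWriter(pf, fieldnames=person_fields)
    items = csv.DictWriter(cf, fieldnames=item_fields)
    people.writeheader()
    items.writeheader()
    for person_id, person in enumerate(sample):
        # Parent row: every field except the nested list.
        flat = {k: v for k, v in person.items() if k != "collection"}
        people.writerow({"person_id": person_id, **flat})
        # Child rows: one per collection item, tagged with the parent's id.
        for item in person["collection"]:
            items.writerow({"person_id": person_id, **item})

# Read both files back and re-attach each item to its parent.
restored_by_id = {}
with open(people_path, newline="", encoding="utf-8") as pf:
    for row in csv.DictReader(pf):
        person_id = row.pop("person_id")
        restored_by_id[person_id] = {**row, "collection": []}
with open(items_path, newline="", encoding="utf-8") as cf:
    for row in csv.DictReader(cf):
        restored_by_id[row.pop("person_id")]["collection"].append(row)

restored = list(restored_by_id.values())
```

Note that csv stores everything as text: after the round trip, 30 comes back as "30" and None comes back as an empty string, so the types have to be converted by hand if they matter.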
Trenton McKinney