1

I have a set of python dictionaries that I have obtained by means of a for loop. I am trying to have these added to Pandas Dataframe.

Output for a variable called output

{'name':'Kevin','age':21}
{'name':'Steve','age':31}
{'name':'Mark','age':11}

I am trying to append each of these dictionary into a single Dataframe. I tried to perform the below but it just added the first row.

df = pd.DataFrame(output)

Could anyone advice as to where am going wrong and have all the dictionaries added to the Dataframe.

Update on the loop statement

The below code helps to read xml and convert it to a dataframe. Right now I see I am able to loop in through multiple xml files and created dictionaries for each xml file. I am trying to see how could I add each of these dictionaries to a single Dataframe:

def f(elem, result):
    result[elem.tag] = elem.text
    cs = elem.getchildren()
    for c in cs:
        result = f(c, result)
    return result

    result = {}
    for file in allFiles:
        tree = ET.parse(file)
        root = tree.getroot()
        result = f(root, result)
        print(result)

2 Answers2

3

You can append each dictionary to list and last call DataFrame constructor:

out = []
for file in allFiles:
    tree = ET.parse(file)
    root = tree.getroot()
    result = f(root, result)
    out.append(result)
        
df = pd.DataFrame(out)
Guru
  • 2,331
  • 6
  • 31
  • 48
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
1

We can add these dicts to a list:

ds = []
for ...:      # your loop
    ds += [d] # where d is one of the dicts

When we have the list of dicts, we can simply use pd.DataFrame on that list:

ds = [
    {'name':'Kevin','age':21},
    {'name':'Steve','age':31},
    {'name':'Mark','age':11}
]
pd.DataFrame(ds)

Output:

    name  age
0  Kevin   21
1  Steve   31
2   Mark   11

Update: And it's not a problem if different dicts have different keys, e.g.:

ds = [
    {'name':'Kevin','age':21},
    {'name':'Steve','age':31,'location': 'NY'},
    {'name':'Mark','age':11,'favorite_food': 'pizza'}
]
pd.DataFrame(ds)

Output:

   age favorite_food location   name
0   21           NaN      NaN  Kevin
1   31           NaN       NY  Steve
2   11         pizza      NaN   Mark

Update 2: Building up on our previous discussion in Python - Converting xml to csv using Python pandas we can do:

results = []
for file in glob.glob('*.xml'):
    tree = ET.parse(file)
    root = tree.getroot()
    result = f(root, {})
    result['filename'] = file # added filename to our results
    results += [result]

pd.DataFrame(results)
perl
  • 9,826
  • 1
  • 10
  • 22
  • thanks for the help. I am continuing from the other ticket where you helped me build the xml to csv convertor. Since it was a new issue I raised a new ticket on this. I have multiple dictionaries from each of the xml file but now trying to add them to Dataframe – coding learner Mar 06 '19 at 12:02
  • I tried to add all dicts to a list by doing the below `ds = [] for i in result: ds += [i] print(ds)` The above converts the dicts to a list but not dicts inside a list as shown below. Sample output: `['name':'Kevin','age':21]` `['name':'Steve','age':31]` `['name':'Mark','age':11]` – coding learner Mar 06 '19 at 12:17
  • OK, so let me clarify it a bit, on each step we get a dict like {'name': 'Kevin', ...}, right? – perl Mar 06 '19 at 12:44
  • 1
    OK, great, can you check the code from `Update 2` please? – perl Mar 06 '19 at 12:51
  • Added filename to the results with `result['filename'] = file # added filename to our results` – perl Mar 06 '19 at 14:24