Suppose I have the following list in Python:

[
    {'id': [1,2,3]},
    {'name': [4,3,2]},
    {'age': [9,0,1]},
]

How would I load this into a pandas dataframe? Usually I do pd.DataFrame from a dict, but it's important for me to maintain the column order.

The final data should look like this:

id     name       age
1      4          9
2      3          0
3      2          1

3 Answers

You can construct a single dictionary and then feed it to pd.DataFrame. To guarantee that column ordering is preserved, use collections.OrderedDict:

from collections import OrderedDict

import pandas as pd

L = [{'id': [1,2,3]},
     {'name': [4,3,2]},
     {'age': [9,0,1]}]

df = pd.DataFrame(OrderedDict([(k, v) for d in L for k, v in d.items()]))

print(df)

   id  name  age
0   1     4    9
1   2     3    0
2   3     2    1

With Python 3.7+ dictionaries are insertion ordered, so you can use a regular dict:

df = pd.DataFrame({k: v for d in L for k, v in d.items()})
jpp

Or merge the list of dictionaries (source) and convert the result to a dataframe:

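# original_data is the list of dictionaries from the question
merged_data = {}

# Merge the dictionaries in list order; a plain loop is clearer than
# a comprehension or map() used only for its side effects.
for d in original_data:
    merged_data.update(d)

df = pd.DataFrame.from_dict(merged_data)
# Select the columns explicitly to fix their order.
df = df[['id', 'name', 'age']]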

print(df)

#    id  name  age
# 0   1     4    9
# 1   2     3    0
# 2   3     2    1

To me, this is clearer and more readable.
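
If hard-coding the column names is undesirable, the order could also be derived from the list itself. This is only a minimal sketch, assuming every key appears in exactly one dictionary (with duplicate keys, the later value would overwrite the earlier one). The names `original_data` and `column_order` are illustrative, not part of the question.

```python
import pandas as pd

# Sample data from the question (the variable name is an assumption).
original_data = [
    {'id': [1, 2, 3]},
    {'name': [4, 3, 2]},
    {'age': [9, 0, 1]},
]

# Collect the keys in the order they appear in the list.
column_order = [key for d in original_data for key in d]

# Merge the single-key dictionaries into one.
merged_data = {}
for d in original_data:
    merged_data.update(d)

# Select the columns explicitly so the order does not depend on the dict.
df = pd.DataFrame(merged_data)[column_order]
print(df)
```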

Darius
  • How do you know that the `dict` will preserve ordering? Is that an OrderedDict, or ? –  Dec 27 '18 at 03:42
  • Thanks, I misunderstood your question. Nevertheless, did you think about [reordering the columns](https://stackoverflow.com/a/41968766/1293700)? – Darius Dec 27 '18 at 03:51
  • I completed the example. – Darius Dec 27 '18 at 04:00

A little hacky, but does this work (assuming d is your list)?

pd.concat([pd.DataFrame(d_) for d_ in d], axis=1)
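
For reference, here is a self-contained version of that one-liner, run against the data from the question. It is a sketch rather than a recommendation; the variable name `d` follows the answer's own assumption.

```python
import pandas as pd

# The list from the question, bound to the name the answer assumes.
d = [
    {'id': [1, 2, 3]},
    {'name': [4, 3, 2]},
    {'age': [9, 0, 1]},
]

# Each dictionary becomes a one-column frame; axis=1 places the frames
# side by side in list order.
df = pd.concat([pd.DataFrame(d_) for d_ in d], axis=1)
print(df)
```

Because the frames are concatenated in list order, the column order follows the list and does not rely on dictionary ordering.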

hchw
  • that works, thanks. Could you please explain what `axis=1` does? –  Dec 27 '18 at 03:29
  • Sure! `axis=0` is the default and concatenates dataframes vertically: if you have two dataframes of three columns and three rows, `axis=0` will make one with three columns and six rows. `axis=1` is the opposite and concatenates them horizontally, so you would have six columns and three rows. Since in this case you are creating three new dataframes of three rows and one column each, you need `axis=1` to get the resulting 3x3 df. (More info here: https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.concat.html) – hchw Dec 27 '18 at 03:32
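
The two-frame example from the comment above can be checked directly. A small sketch; the frames `a` and `b` and their column names are made up for illustration.

```python
import pandas as pd

# Two hypothetical frames, each with three rows and three columns.
a = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6], 'z': [7, 8, 9]})
b = pd.DataFrame({'x': [10, 11, 12], 'y': [13, 14, 15], 'z': [16, 17, 18]})

# axis=0 (the default) stacks the rows: six rows, three columns.
stacked = pd.concat([a, b])

# axis=1 places the frames side by side: three rows, six columns.
side_by_side = pd.concat([a, b], axis=1)

print(stacked.shape)       # (6, 3)
print(side_by_side.shape)  # (3, 6)
```

Note that with `axis=0` the row labels 0, 1, 2 repeat unless `ignore_index=True` is passed, and with `axis=1` the column labels repeat here because both frames use the same names.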