How to flatten pandas dataframe

Question

Here is my pandas DataFrame, which I would like to flatten. How can I do that?

The input I have

key  column
1    {'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'}
2    {'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'}
3    {'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'}

The expected output

Each health_* key and name should become a column of its own, holding the corresponding values. The column order does not matter.

health_1  health_2  health_3  health_4  name  key
45        60        34        60        Tom   1
28        10        42        7         John  2
86        65        14        52        Adam  3

Please show the expected output. Do you want e.g. 4 rows (health_...) from each source row? — Valdi_Bo, Dec 05 '18 at 14:38
@Valdi_Bo Not sure I understood you correctly. Basically, every row has 5 columns, if that helps. — PolarBear10, Dec 05 '18 at 14:44

AlwaysSunny · Accepted Answer · 2018-12-05T14:52:48.390

You can do it with a one-line solution:

df_expected = pd.concat([df, df['column'].apply(pd.Series)], axis = 1).drop('column', axis = 1)

Full version:

import pandas as pd
df = pd.DataFrame({"column":[
{'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'}   ,
{'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'}  ,
{'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'}
]})

df_expected = pd.concat([df, df['column'].apply(pd.Series)], axis = 1).drop('column', axis = 1)
print(df_expected)

DEMO: https://repl.it/repls/ButteryFrightenedFtpclient
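The full version above builds the frame without the key column from the question. As a minimal sketch, assuming the real frame has both key and column as described in the question, the same one-liner keeps key alongside the expanded columns:

```python
import pandas as pd

# Rebuild the question's frame, this time including the 'key' column
df = pd.DataFrame({
    "key": [1, 2, 3],
    "column": [
        {'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'},
        {'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'},
        {'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'},
    ],
})

# Expand each dict into its own columns, place them next to the original
# frame, then drop the nested column that is no longer needed
df_expected = pd.concat([df, df["column"].apply(pd.Series)], axis=1).drop("column", axis=1)
print(df_expected)
```

The result has key first, followed by the four health columns and name, which matches the expected output since column order does not matter there.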

score 4 · Answer 2 · answered Dec 05 '18 at 15:24

This should work:

df['column'].apply(pd.Series)

Gives:

   health_1  health_2  health_3  health_4  name
0  45        60        34        60        Tom 
1  28        10        42        7         John
2  86        65        14        52        Adam

BhishanPoudel

score 2 · Answer 3 · answered Dec 05 '18 at 14:51

Try:

pd.concat([pd.DataFrame(i, index=[0]) for i in df.column], ignore_index=True)

Output:

   health_1  health_2  health_3  health_4  name
0        45        60        34        60   Tom
1        28        10        42         7  John
2        86        65        14        52  Adam
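In case the index=[0] part looks arbitrary: each dict here holds only scalar values, and pandas cannot infer a row index from scalars alone. A small sketch of that behaviour, using one row copied from the question (the names row, raised, and one_row are just for illustration):

```python
import pandas as pd

row = {'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'}

# A dict of plain scalars gives pandas nothing to build an index from,
# so the constructor raises ValueError unless an index is supplied
try:
    pd.DataFrame(row)
    raised = False
except ValueError:
    raised = True

# Supplying a one-element index produces a single-row frame
one_row = pd.DataFrame(row, index=[0])
print(raised, one_row.shape)
```

Because every one-row frame then carries the index label 0, ignore_index=True in the concat call is what renumbers the rows 0, 1, 2.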

user3483203 · Answer 4 · 2018-12-05T16:13:33.837

The solutions using apply are overkill. The DataFrame constructor accepts a list of dictionaries directly, which is exactly what your column Series holds. You can get that list with the tolist method:

res = pd.concat([df.key, pd.DataFrame(df.column.tolist())], axis=1)
print(res)

   key  health_1  health_2  health_3  health_4  name
0    1        45        60        34        60   Tom
1    2        28        10        42         7  John
2    3        86        65        14        52  Adam
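One caveat, sketched under the assumption that your real frame might not have the default 0, 1, 2 index: pd.DataFrame(df.column.tolist()) always gets a fresh default index, and concat with axis=1 aligns on index labels. Passing the original index along keeps the rows lined up. The index values 10, 20, 30 below are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "key": [1, 2, 3],
        "column": [
            {'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'},
            {'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'},
            {'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'},
        ],
    },
    index=[10, 20, 30],  # hypothetical non-default index
)

# Reuse the original index so that concat(axis=1) matches rows label by label
expanded = pd.DataFrame(df["column"].tolist(), index=df.index)
res = pd.concat([df["key"], expanded], axis=1)
print(res)
```

Without index=df.index, the two pieces would share no labels, and the concatenated result would contain six rows padded with NaN instead of three aligned ones.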

score 0 · Answer 5 · answered Dec 05 '18 at 14:35

Not sure I understand. Isn't this the default format for a DataFrame?

import pandas as pd
df = pd.DataFrame([
{'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'}   ,
{'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'}  ,
{'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'}
])

Rich

Your answer is how I want it to look, but unfortunately my columns are nested inside each row along with their values. This is data I get from a really bad server, and I need to convert it to a pandas DataFrame. – PolarBear10 Dec 05 '18 at 14:37
What I am presenting above is what I have: a DataFrame with 2 columns, key and column. I would like to unpack the rows so that each key in the dict becomes a column of its own. – PolarBear10 Dec 05 '18 at 14:38
