6

Here is my pandas DataFrame, and I would like to flatten it. How can I do that?

The input I have

key column
1 {'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'}   
2 {'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 07, 'name': 'John'}  
3 {'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'}

The expected output

Each health_* key and the name key should become a column of its own, holding the corresponding values. The column order does not matter.

health_1  health_2  health_3  health_4  name  key
45        60        34        60        Tom   1
28        10        42        07        John  2
86        65        14        52        Adam  3
PolarBear10

5 Answers

6

You can do it with a one-line solution:

df_expected = pd.concat([df, df['column'].apply(pd.Series)], axis = 1).drop('column', axis = 1)

Full version:

import pandas as pd
df = pd.DataFrame({"column":[
{'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'}   ,
{'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'}  ,
{'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'}
]})

df_expected = pd.concat([df, df['column'].apply(pd.Series)], axis = 1).drop('column', axis = 1)
print(df_expected)

DEMO: https://repl.it/repls/ButteryFrightenedFtpclient

A l w a y s S u n n y
4

This should work:

df['column'].apply(pd.Series)

Gives:

   health_1  health_2  health_3  health_4  name
0  45        60        34        60        Tom 
1  28        10        42        7         John
2  86        65        14        52        Adam
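
Note that the call above returns only the expanded columns. If the key column from the question should stay next to them, one possible sketch is to drop the dict column and join the expanded frame back on the index. The frame below is rebuilt from the question's data, and the names expanded and result are only illustrative:

```python
import pandas as pd

# Two-column frame as described in the question: "key" plus a column of dicts.
df = pd.DataFrame({
    "key": [1, 2, 3],
    "column": [
        {'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'},
        {'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'},
        {'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'},
    ],
})

# Turn each dict into a row; the dict keys become the column names.
expanded = df["column"].apply(pd.Series)

# Drop the original dict column and attach the expanded columns by index.
result = df.drop(columns="column").join(expanded)
print(result)
```

Because join aligns on the index, this should also hold up if df does not have the default 0..n-1 index.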
BhishanPoudel
2

Try:

pd.concat([pd.DataFrame(i, index=[0]) for i in df.column], ignore_index=True)

Output:

   health_1  health_2  health_3  health_4  name
0        45        60        34        60   Tom
1        28        10        42         7  John
2        86        65        14        52  Adam
Scott Boston
2

The solutions using apply are going overboard. You can create your desired DataFrame using a list of dictionaries like you have in your column Series. You can easily get this list of dictionaries by using the tolist method:

res = pd.concat([df.key, pd.DataFrame(df.column.tolist())], axis=1)
print(res)

   key  health_1  health_2  health_3  health_4  name
0    1        45        60        34        60   Tom
1    2        28        10        42         7  John
2    3        86        65        14        52  Adam
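
One caveat: pd.DataFrame(df.column.tolist()) gets a fresh 0..n-1 index, and pd.concat with axis=1 aligns rows on index labels. If df has been filtered or otherwise carries a non-default index, passing index=df.index should keep the rows lined up. A minimal sketch, assuming the two-column frame from the question; the index labels 10, 11, 12 are made up here to simulate a non-default index:

```python
import pandas as pd

# Same data as the question, but with a hypothetical non-default index.
df = pd.DataFrame(
    {
        "key": [1, 2, 3],
        "column": [
            {'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'},
            {'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'},
            {'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'},
        ],
    },
    index=[10, 11, 12],
)

# Build the flat frame from the list of dicts, reusing the original index
# so that the column-wise concat matches rows correctly.
flat = pd.DataFrame(df["column"].tolist(), index=df.index)
res = pd.concat([df["key"], flat], axis=1)
print(res)
```

Without index=df.index, the concat in this example would produce six rows padded with NaN instead of three.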
user3483203
0

Not sure I understand. Isn't this the default format for a DataFrame?

import pandas as pd
df = pd.DataFrame([
{'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'}   ,
{'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'}  ,
{'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'}
])
Rich
  • Your answer shows how I want the result to look, but unfortunately my columns are nested inside each row together with their values. This is data I get from a really bad server, and I need to convert it to a pandas DataFrame – PolarBear10 Dec 05 '18 at 14:37
  • What I presented above is what I have: a DataFrame with 2 columns, key and column. I would like to unpack the rows so that each key in the row becomes a column of its own – PolarBear10 Dec 05 '18 at 14:38