1

So guys i have the following data:

import pandas as pd
a = [[0, 4, 'a', 1], [0, 4, 'b', 1], [0, 2, 'a', 0.5], [0, 2, 'b', -1], [0, 2, 'c', 1]]
df = pd.DataFrame(a)
df.columns = ['x', 'y', 'features', 'values']

   x  y features  values
0  0  4        a     1.0
1  0  4        b     1.0
2  0  2        a     0.5
3  0  2        b    -1.0
4  0  2        c     1.0

I want a new dataframe that looks like this:

b = [[0, 4, 1, 1, 0], [0, 2, 0.5, -1, 1]]
df2 = pd.DataFrame(b)
df2.columns = ['x', 'y', 'a', 'b', 'c']

   x  y    a  b  c
0  0  4  1.0  1  0
1  0  2  0.5 -1  1

So I have to groupby(['x','y']) but then how I can continue? I want that the values in feature columns become new columns, ant then I use the values in 'values' col to fill these columns. please help me.

Salsa94
  • 45
  • 5

2 Answers2

1

Use DataFrame.pivot_table

new_df = df.pivot_table(index=['x', 'y'],
                        columns='features', 
                        values='values', 
                        fill_value=0)\
           .reset_index()\
           .rename_axis(columns=None)
print(new_df)

   x  y    a  b  c
0  0  2  0.5 -1  1
1  0  4  1.0  1  0

Or without DataFrame.pivot_table:

new_df = df.set_index(['x', 'y', 'features'])\
           .unstack('features', fill_value=0)\
           .reset_index()\
           .droplevel(0, axis=1)\
           .rename_axis(columns=None)
print(new_df)
ansev
  • 30,322
  • 5
  • 17
  • 31
1

Another possible solution, based on pandas.DataFrame.pivot:

(df.pivot(index=['x', 'y'], columns='features')
 .droplevel(0, axis=1).rename_axis(None, axis=1)
 .reset_index().fillna(0))

Alternatively, to be more concise, we can use janitor.pivot_wider:

# pip install pyjanitor
from janitor import pivot_wider

(pivot_wider(df, index=['x', 'y'], names_from='features', values_from='values')
 .fillna(0))

Output:

   x  y    a    b    c
0  0  2  0.5 -1.0  1.0
1  0  4  1.0  1.0  0.0
PaulS
  • 21,159
  • 2
  • 9
  • 26