Reshaping pandas dataframe with unstack()

Question

I am trying to reshape pandas DataFrame so that one of the columns would be unstacked to 'broader'. Once I proceed with unstack() new column levels occure but I seem to be unable to re-arrange the headers the way I want.

Firstly, I have following df:

from pandas import *

fList = [['Packs', 'Brablik', 'Holesovice', '2017', 100],
         ['Decorations', 'Drapp-design', 'Holesovice', '2017', 150],
         ['Decorations', 'Klapetkovi', 'Holesovice', '2017', 200],
         ['Decorations', 'Lezecké dárky', 'Fler', '2017', 100],
         ['Decorations', 'PP', 'Other', '2017', 350],
         ['Decorations', 'Pavlimila', 'Akce', '2017', 20],
         ['Decorations', 'Pavlimila', 'Holesovice', '2017', 50],
         ['Decorations', 'Wiccare', 'Holesovice', '2017', 70],
         ['Toys', 'Klára Vágnerová', 'Holesovice', '2017', 100],
         ['Toys', 'Lucie Polonyiová', 'Holesovice', '2017', 80],
         ['Dresses', 'PP', 'Other', '2018', 200]]

df = DataFrame(fList, columns = ['Section', 'Seller', 'Store', 'Selected_period', 'Total_pieces'])

This produces:

Consequently I reshape it like:

df = df.set_index(['Section', 'Seller', 'Store', 'Selected_period']).unstack(level = -1)
df = df.fillna(0)
df.columns = df.columns.droplevel(0)

That outputs:

However, I would like to have just following columns in the final dataframe: Section, Seller, Store, 2017, 2018. I still fail to re-arrange it so that I would get the output I want, despite I tried to adopt solutions posted here and here and here. Any suggestions?

The output I am after is the last posted but with different columns, there must be first column Section, second column Seller, third column Store, fourth column 2018, fifth column 2018.. — New2coding, Mar 21 '18 at 10:42

foglerit · Accepted Answer · 2018-03-21T11:12:20.627

3

If I understand correctly, you seem to just be missing a reset_index() call. Try this:

df = df.set_index(['Section', 'Seller', 'Store', 'Selected_period']).unstack(level = -1).fillna(0)
df.columns = df.columns.droplevel(0).rename('')
df = df.reset_index()

edited Mar 21 '18 at 11:12

answered Mar 21 '18 at 10:55

foglerit

7,792
8
44
64

Is there any way to get some kind of concatenated column names for 2017 and 2018 so that they would be named "Total_pieces: 2017" and "Total_pieces: 2018"? – New2coding Mar 22 '18 at 13:09
1

Yes. Not very elegant, but in the solution above you can replace `df.columns.droplevel(0).rename('')` by `df.columns.get_level_values(0) + ': ' + df.columns.get_level_values(1)` – foglerit Mar 23 '18 at 11:29
Thank you! This also worked on pd.pivot_table() as well. – JJAN Apr 22 '20 at 19:07

Reshaping pandas dataframe with unstack()

1 Answers1