4

Hi I'm trying to pivot my data from long to wide in Python. I'm getting confused by some of the other answers on here with a mix of people using pivot, or set_index and unstack, and hierarchical indexing etc.

Below is a sample df with 4 columns and 9 rows. I'd like to be able to pivot the data so that there is exactly 1 row for every unique id which includes columns like Jan 2021 sales and Jan 2021 profit. So sales & profit would each have their own column for every month.

import pandas as pd
from datetime import datetime, timedelta

sample_df = pd.DataFrame({
 'id': [1, 1, 1, 2, 2, 2, 3, 3, 3],
 'month': ['Jan 2021', 'Feb 2021', 'Mar 2021', 'Jan 2021', 'Feb 2021', 'Mar 2021', 'Jan 2021', 'Feb 2021', 'Mar 2021'],
 'sales': [100, 200, 300, 400, 500, 600, 700, 800, 900],
 'profit': [10, 20, 30, 40, 50, 60, 70, 80, 90]}
)

sample_df_pivot = xxx

Ideally it would look like this enter image description here

Any help would be appreciated !

jyablonski
  • 711
  • 1
  • 7
  • 17
  • 3
    Will you please provide a sample dataframe containing your expected output? –  Dec 20 '21 at 17:08

2 Answers2

4

Try this:

df = sample_df.pivot(columns='month', index='id')
df.columns = df.columns.swaplevel().map(' '.join)

Output:

>>> df
    Feb 2021 sales  Jan 2021 sales  Mar 2021 sales  Feb 2021 profit  Jan 2021 profit  Mar 2021 profit
id                                                                                                   
1              200             100             300               20               10               30
2              500             400             600               50               40               60
3              800             700             900               80               70               90
1

Is this what you're looking for?

sample_df.pivot(index='id', columns ='month')

         sales                     profit                  
month Feb 2021 Jan 2021 Mar 2021 Feb 2021 Jan 2021 Mar 2021
id                                                         
1          200      100      300       20       10       30
2          500      400      600       50       40       60
3          800      700      900       80       70       90

Patryk Kowalski
  • 561
  • 3
  • 13
  • It's close to something I came up with previously but I'm not sure how to manipulate the multi level indexing & the column names after that. – jyablonski Dec 20 '21 at 17:20
  • [Here](https://stackoverflow.com/questions/14507794/pandas-how-to-flatten-a-hierarchical-index-in-columns) you can find examples on how to flatten the resulting multiindex – Patryk Kowalski Dec 20 '21 at 17:31