0

I have this dataframe df:

            alpha1    week_day    calendar_week
0             2.49     Freitag  2022-04-(01/07)
1             1.32     Samstag  2022-04-(01/07)
2             2.70     Sonntag  2022-04-(01/07)
3             3.81      Montag  2022-04-(01/07)
4             3.58    Dienstag  2022-04-(01/07)
5             3.48    Mittwoch  2022-04-(01/07)
6             1.79  Donnerstag  2022-04-(01/07)
7             2.12     Freitag  2022-04-(08/14)
8             2.41     Samstag  2022-04-(08/14)
9             1.78     Sonntag  2022-04-(08/14)
10            3.19      Montag  2022-04-(08/14)
11            3.33    Dienstag  2022-04-(08/14)
12            2.88    Mittwoch  2022-04-(08/14)
13            2.98  Donnerstag  2022-04-(08/14)
14            3.01     Freitag  2022-04-(15/21)
15            3.04     Samstag  2022-04-(15/21)
16            2.72     Sonntag  2022-04-(15/21)
17            4.11      Montag  2022-04-(15/21)
18            3.90    Dienstag  2022-04-(15/21)
19            3.16    Mittwoch  2022-04-(15/21)

and so on, with ascending calendar weeks. I performed a pivot table to generate a heatmap.
df_pivot = pd.pivot_table(df, values=['alpha1'], index=['week_day'], columns=['calendar_week'])
What I get is:

                  alpha1                                        \
calendar_week 2022-(04-29/05-05) 2022-(05-27/06-02) 2022-(07-29/08-04)   
week_day                                                                 
Dienstag                    3.32               2.09               4.04   
Donnerstag                  3.27               2.21               4.65   
Freitag                     2.83               3.08               4.19   
Mittwoch                    3.22               3.14               4.97   
Montag                      2.83               2.86               4.28   
Samstag                     2.62               3.62               3.88   
Sonntag                     2.81               3.25               3.77   

                                                                  \
calendar_week 2022-(08-26/09-01) 2022-04-(01/07) 2022-04-(08/14)   
week_day                                                           
Dienstag                    2.92            3.58            3.33   
Donnerstag                  3.58            1.79            2.98   
Freitag                     3.96            2.49            2.12   
Mittwoch                    3.09            3.48            2.88   
Montag                      3.85            3.81            3.19   
Samstag                     3.10            1.32            2.41   
Sonntag                     3.39            2.70            1.78   

As you see the sorting of the pivot table is messed up. I need the same sorting for the columns (calendar weeks) as in the original dataframe.
I have been looking all over but couldn't find how to achieve this.
Would be also very nice, if the sorting of the rows remains the same.

Any help will be greatly appreciated

UPDATE
I didn't paste all the data. It would have been too long
The calendar_week column consist of following elements

'2022-04-(01/07)', 
'2022-04-(08/14)',
'2022-04-(15/21)', 
'2022-04-(22/28)', 
'2022-(04-29/05-05)', 
'2022-05-(06/12)',
'2022-05-(13/19)',
'2022-05-(20/26)',
'2022-(05-27/06-02)', 
'2022-06-(03/09)'
'2022-06-(10/16)'
'2022-06-(17/23)'
'2022-06-(24/30)'
'2022-07-(01/07)'
'2022-07-(08/14)'
'2022-07-(15/21)'
'2022-07-(22/28)'
'2022-(07-29/08-04)'
'2022-08-(05/11)'
etc....

Each occurs 7 times in df. It represents a calendar week.
The sorting is the natural time sorting.
After pivoting the dataframe, the sorting of the column get messed up. And I guess it's due to the 2 different types: 2022-(07-29/08-04) and 2022-07-(15/21).

codyLine
  • 509
  • 2
  • 8
  • 26

2 Answers2

0

Try running this:

df_pivot.sort_values(by = ['calendar_week'], axis = 1, ascending = True)

I got the following output. Is this what you wanted?

calendar_week 2022-04-(01/07) 2022-04-(08/14) 2022-04-(15/21)
week_day
Dienstag 3.58 3.33 3.90
Donnerstag 1.79 2.98 NaN
Freitag 2.49 2.12 3.01
Mittwoch 3.48 2.88 3.16
Montag 3.81 3.19 4.11

be sure to remove the NaN values using the fillna() function.

I hope that answers it. :)

Inshaullah
  • 99
  • 5
0

You can use an ordered Categorical for your week days and sort the dates after pivoting with sort_index:

# define the desired order of the days
days = ['Montag', 'Dienstag', 'Mittwoch', 'Donnerstag',
        'Freitag', 'Samstag', 'Sonntag']

df_pivot = (df
            .assign(week_day=pd.Categorical(df['week_day'], categories=days,
                                            ordered=True))
            .pivot_table(values='alpha1', index='week_day',
                         columns='calendar_week')
            .sort_index(axis=1)
           )

output:

calendar_week  2022-04-(01/07)  2022-04-(08/14)  2022-04-(15/21)
week_day                                                        
Montag                    3.81             3.19             4.11
Dienstag                  3.58             3.33             3.90
Mittwoch                  3.48             2.88             3.16
Donnerstag                1.79             2.98              NaN
Freitag                   2.49             2.12             3.01
Samstag                   1.32             2.41             3.04
Sonntag                   2.70             1.78             2.72
mozway
  • 194,879
  • 13
  • 39
  • 75