0

Would like to sort a pandas pivot by its values.

data = {'Counterparty': {0: 'A',
  1: 'B',
  2: 'B',
  3: 'A',
  4: 'A',
  5: 'C',
  6: 'D',
  7: 'E',
  8: 'E',
  9: 'C',
  10: 'F',
  11: 'C',
  12: 'C',
  13: 'G'},
 'Contract': {0: 'A1',
  1: 'B1',
  2: 'B2',
  3: 'A2',
  4: 'A3',
  5: 'C1',
  6: 'D1',
  7: 'E1',
  8: 'E2',
  9: 'C2',
  10: 'F1',
  11: 'C3',
  12: 'C4',
  13: 'G'},
 'Delivery': {0: '1/8/2019',
  1: '1/8/2019',
  2: '1/8/2019',
  3: '1/8/2019',
  4: '1/8/2019',
  5: '1/8/2019',
  6: '1/8/2019',
  7: '1/8/2019',
  8: '1/8/2019',
  9: '1/8/2019',
  10: '1/8/2019',
  11: '1/8/2019',
  12: '1/8/2019',
  13: '1/8/2019'},
 'Price': {0: 134.0,
  1: 151.0,
  2: 149.0,
  3: 134.0,
  4: 132.14700000000002,
  5: 150.0,
  6: 134.566,
  7: 153.0,
  8: 151.0,
  9: 135.0,
  10: 149.0,
  11: 135.0,
  12: 147.0,
  13: 151.0},
 'Balance': {0: 200.0,
  1: 54.87,
  2: 200.0,
  3: 133.44,
  4: 500.0,
  5: 500.0,
  6: 1324.05,
  7: 279.87,
  8: 200.0,
  9: 20.66,
  10: 110.15,
  11: 100.0,
  12: 100.0,
  13: 35.04}}

Get df from data: df = pd.DataFrame(data)

Define a function to calculate weighted average of prices

wa = lambda x: np.average(x, weights=df.loc[x.index, 'Balance'])

pivot = df.pivot_table(
    index=['Counterparty', 'Contract'],
    columns='Delivery',
    values=['Balance', 'Price'],
    aggfunc={
        'Balance': sum,
        'Price': wa
    },
    margins=True
).fillna('').swaplevel(0,1,axis=1).sort_index(axis=1).round(3)

Results:

Delivery               1/8/2019          1/9/2019           All         
                       Balance    Price  Balance    Price   Balance    Price
Counterparty Contract                                                      
A            A1            200      134                     200.00  134.000
             A2         133.44      134                     133.44  134.000
             A3            500  132.147                     500.00  132.147
B            B1          54.87      151                      54.87  151.000
             B2            200      149                     200.00  149.000
C            C1            500      150                     500.00  150.000
             C2                            20.66      135    20.66  135.000
             C3            100      135                     100.00  135.000
             C4                              100      147   100.00  147.000
D            D1                          1324.05  134.566  1324.05  134.566
E            E1         279.87      153                     279.87  153.000
             E2            200      151                     200.00  151.000
F            F1         110.15      149                     110.15  149.000
G            G           35.04      151                      35.04  151.000
All                    2313.37  143.541  1444.71  135.433  3758.08  140.424

Check multiindex for column order:

MultiIndex([('1/8/2019', 'Balance'),
            ('1/8/2019',   'Price'),
            ('1/9/2019', 'Balance'),
            ('1/9/2019',   'Price'),
            (     'All', 'Balance'),
            (     'All',   'Price')],
           name=['Delivery', 'None'])

Rename multiindex and try to sort by passing in a list containing a tuple (Reference: Multi Index Sorting in Pandas):

pivot.columns.name = ['Delivery', 'Metrics']

MultiIndex([('1/8/2019', 'Balance'),
            ('1/8/2019',   'Price'),
            ('1/9/2019', 'Balance'),
            ('1/9/2019',   'Price'),
            (     'All', 'Balance'),
            (     'All',   'Price')],
           name=['Delivery', 'Metrics'])

pivot.sort_values(by=[('Metrics', 'Delivery')], ascending=False)

Results in a keyerror.

Desired results:

Delivery               1/8/2019          1/9/2019          All         
                       Balance    Price  Balance    Price  Balance    Price
Counterparty Contract                                                      
C            C1        500      150                        500.00  150.000
A            A3        500      132.147                    500.00  132.147
E            E1        279.87   153                        279.87  153
...
yongsheng
  • 376
  • 3
  • 19

1 Answers1

1

First remove fillna for avoid mixed values numeric and strings and then sorting by tuples created from MultiIndex, not by MultiIndex.columns.names. Last if need All row to last row add concat:

df = pd.DataFrame(data)
#print (df)

wa = lambda x: np.average(x, weights=df.loc[x.index, 'Balance'])

pivot = df.pivot_table(
    index=['Counterparty', 'Contract'],
    columns='Delivery',
    values=['Balance', 'Price'],
    aggfunc={
        'Balance': sum,
        'Price': wa
    },
    margins=True
).swaplevel(0,1,axis=1).sort_index(axis=1).round(3)
pivot.columns.name = ['Delivery', 'Metrics']

df = pivot.sort_values(by=[('1/8/2019', 'Balance'), ('1/8/2019', 'Price')], ascending=False)

df = pd.concat([df.iloc[1:], df.iloc[[0]]])

print (df)
Delivery              1/8/2019          1/9/2019               All         
                       Balance    Price  Balance    Price  Balance    Price
Counterparty Contract                                                      
C            C1         500.00  150.000      NaN      NaN   500.00  150.000
A            A3         500.00  132.147      NaN      NaN   500.00  132.147
E            E1         279.87  153.000      NaN      NaN   279.87  153.000
             E2         200.00  151.000      NaN      NaN   200.00  151.000
B            B2         200.00  149.000      NaN      NaN   200.00  149.000
A            A1         200.00  134.000      NaN      NaN   200.00  134.000
             A2         133.44  134.000      NaN      NaN   133.44  134.000
F            F1         110.15  149.000      NaN      NaN   110.15  149.000
C            C3         100.00  135.000      NaN      NaN   100.00  135.000
B            B1          54.87  151.000      NaN      NaN    54.87  151.000
G            G           35.04  151.000      NaN      NaN    35.04  151.000
C            C2            NaN      NaN    20.66  135.000    20.66  135.000
             C4            NaN      NaN   100.00  147.000   100.00  147.000
D            D1            NaN      NaN  1324.05  134.566  1324.05  134.566
All                    2313.37  143.541  1444.71  135.433  3758.08  140.424

If need sorting by all columns:

df = pivot.sort_values(by=pivot.columns.tolist(), ascending=False)

df = pd.concat([df.iloc[1:], df.iloc[[0]]])
print (df)
Delivery              1/8/2019          1/9/2019               All         
                       Balance    Price  Balance    Price  Balance    Price
Counterparty Contract                                                      
C            C1         500.00  150.000      NaN      NaN   500.00  150.000
A            A3         500.00  132.147      NaN      NaN   500.00  132.147
E            E1         279.87  153.000      NaN      NaN   279.87  153.000
             E2         200.00  151.000      NaN      NaN   200.00  151.000
B            B2         200.00  149.000      NaN      NaN   200.00  149.000
A            A1         200.00  134.000      NaN      NaN   200.00  134.000
             A2         133.44  134.000      NaN      NaN   133.44  134.000
F            F1         110.15  149.000      NaN      NaN   110.15  149.000
C            C3         100.00  135.000      NaN      NaN   100.00  135.000
B            B1          54.87  151.000      NaN      NaN    54.87  151.000
G            G           35.04  151.000      NaN      NaN    35.04  151.000
D            D1            NaN      NaN  1324.05  134.566  1324.05  134.566
C            C4            NaN      NaN   100.00  147.000   100.00  147.000
             C2            NaN      NaN    20.66  135.000    20.66  135.000
All                    2313.37  143.541  1444.71  135.433  3758.08  140.424
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • thanks very much @jezrael.. i understand now to sort the columns using tuples from multiindex .. however, when sorting the entire df (which has a few more months), some months didnt sort properly. do you mind if we continue discussion in chat ? – yongsheng Sep 05 '19 at 12:07
  • @yongsheng - sure, can you invite me? – jezrael Sep 05 '19 at 12:07