0

I have been trying to sort a pivot table generated by pandas. This where I am at so far:

pv=pd.pivot_table(data,index=["TIN"], columns=["Tax Type"], values=["FileFreq"], aggfunc=[np.sum], margins=True)

A subset of the data I am working on is as follows:

TIN Tax Type FileFreq
875-729 CWT 1
875-729 CWT 1
875-729 CWT 1
875-729 CWT 1
875-729 EWT 1
875-729 EWT 1
875-729 EWT 1
875-729 EWT 1
875-729 EWT 1
138-075 CWT 1
138-075 CWT 1
138-075 CWT 1
138-075 CWT 1
138-075 EWT 1
138-075 EWT 1
138-075 EWT 1
138-075 EWT 1
138-075 EWT 1
138-075 EWT 1
138-075 EWT 1
944-507 CWT 1
944-507 CWT 1
944-507 EWT 1
944-507 EWT 1
944-507 EWT 1
944-507 EWT 1

A snapshot of the resulting pivot table I got from the whole dataset is given below:

                             Tax Type Sum of File Freq   
TIN CWT EWT All
130-344 11 7 18
138-075 12 9 21
151-387 12 12 24
230-328 12 12 24
257-018 12 12 24
484-606 9 9 18
597-665 10 10 20
875-729 10 12 22
894-747 12 12 24
944-507 6 11 17
All 106 106 212

I would like to sort from lowest to highest first by the CWT column (one table), then by the EWT column (another table). Have been trying different recommendations from similar questions posted here in SO but to no avail. One of those I tried is

pvsort = pv.reindex(pv['FileFreq'].sort_values(by=['CWT']).index)

but I am getting an error Thanks in advance for the help

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55
  • 1
    did you use sort_values? `pv.sort_values(by='CWT')` ? if yes what exactly is the problem, please elaborate. also if possible , provide us with a reproducible dataset – anky Dec 24 '20 at 11:48
  • thanks anky. sorry it is my first time to ask a question here. how do i send a data set? – Bryan Gobaco Dec 24 '20 at 12:00
  • [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and [How do I ask a good question?](https://stackoverflow.com/help/how-to-ask) are a good reads :-) – anky Dec 24 '20 at 12:03
  • Remember that levels in pivot table are stored in MultiIndex objects. You can sort data by column names but you need to be more specific in defining them. You can use print(pv.columns) to see what I mean. If you provide a dataset, we will be able to give you more detailed answer. – Arkadiusz Dec 24 '20 at 13:52
  • You can't format code (properly that is) in comments. You should edit your question instead. – tgrandje Dec 24 '20 at 15:03
  • thanks, edited my question already to provide a subset of the data – Bryan Gobaco Dec 24 '20 at 15:21

0 Answers0