0

I've used group by and pivot table from pandas package in order to create the following table:

Input:

q4 = q1[['category','Month']].groupby(['category','Month']).Month.agg({'Count':'count'}).reset_index()
q4 = pd.DataFrame(q4.pivot(index='category',columns='Month').reset_index())

then the output :

category                            Count
Month                       6       7       8
0   adult-classes           29.0    109.0   162.0
1   air-pollution           27.0    43.0    13.0
2   babies-and-toddlers     4.0     51.0    2.0
3   bicycle                 210.0   96.0    23.0
4   building                NaN     17.0    NaN
5   buildings-maintenance   23.0    12.0    NaN
6   catering                1351.0  4881.0  1040.0
7   childcare               9.0     NaN     NaN
8   city-planning           105.0   81.0    23.0
9   city-services           2461.0  2130.0  1204.0
10  city-taxes              1.0     4.0     42.0

I'm trying to add a condition to the months, the problem I'm having is that after pivoting I can't access the columns

how can I show only the rows where 6<7<8?

1 Answers1

1

To flatten your multi-index, you can use renaming of your columns (check out this answer).

q4.columns = [''.join([str(c) for c in col]).strip() for col in q4.columns.values]

To remove NaNs:

q4.fillna(0, inplace=True)

To select according to your constraint:

result = q4[(q4['Count6'] < q['Count7']) & (q4['Count7'] < q4['Count8'])]
Léopold Houdin
  • 1,515
  • 13
  • 18
  • your doubt is exactly the problem I'm having. I can't access these columns after pivoting – Matan Sheffer Sep 23 '18 at 12:09
  • Hey, found something. Check my edited answer. Now it should be working from A to Z :). – Léopold Houdin Sep 23 '18 at 12:53
  • the first multi-index gives the following error: TypeError: sequence item 1: expected str instance, int found removing NaN seems to work. the last one will no compile – Matan Sheffer Sep 23 '18 at 13:03
  • that's because in the data I was testing, all columns were `str`, but in your there were `int`. check my again edited answer ;-) – Léopold Houdin Sep 23 '18 at 13:10
  • you've done something good now it's like that "c a t e g o r y C o u n t 6 C o u n t 7 C o u n t 8 0 adult-classes 29.0 109.0 162.0 1 air-pollution 27.0 43.0 13.0 2 babies-and-toddlers 4.0 51.0 2.0 but still the last one don't work – Matan Sheffer Sep 23 '18 at 13:20
  • q4.columns = [''.join([str(c) for c in col]).strip() for col in q4.columns.values] q4.fillna(0, inplace=True) results = q4[(q4['Count6'] < q4['Count7']) & (q4['Count7'] < q4['Count8'])] results.rename(index=str, columns={"Count6": "June", "Count7": "July","Count8": "August" }) that's the final code after minor fix!!! thanks a lot!!!!! <3 – Matan Sheffer Sep 23 '18 at 13:40
  • happy to hear you did it :) – Léopold Houdin Sep 23 '18 at 14:02