2

So I have a pandas pivot table and I want to keep the sum of the columns at the bottom (so the sum row). However, there is also a column that is added when I say margins=True and I don't want that.

I have tried dropping the column, but dropping a column from a multiindex table just resulted in errors. If I do drop('All', axis=1), it says KeyError: 'All', and if I do axis=0 it gets rid of the total on the bottom (which I want).

artist_table = pd.pivot_table(total_df,
                              values=['Records Created', 
                                      'Tickets Sold'],
                              index=['artist name',
                                     'artist id'],
                              columns=['date'],
                              aggfunc=np.sum,
                              fill_value=0,
                              margins=True)

What I want is:

                                 Leads Created       Revenue     
 date                             6/1                 6/2                                        
    artist_id    artist_name      
    XXX           YYY            x                   y            
    AAA           BBB            a                   b            
    All                         (x+a)             (y+b)    

What I have now is:

                                Leads Created       Revenue     
date                              6/1                 6/2           All                                
artist_id    artist_name      
    XXX           YYY              x                   y          (x+y)
    AAA           BBB              a                   b          (a+b)
    All                           (x+a)             (y+b)                    

I would like that All on the right (the sums of the rows) to be gone. Can anyone please help? Thank you in advance!

Mel
  • 21
  • 3
  • 2
    Once you've got the data in artist_table, could you use `artist_table = artist_table.drop("All", axis=1)`? – Jake Losh Jul 01 '19 at 22:54
  • 1
    adding `.drop('All', 1)` to the right of your `pivot_table` command – Andy L. Jul 01 '19 at 22:54
  • See this post: https://stackoverflow.com/questions/13411544/delete-column-from-pandas-dataframe – rey don sancho Jul 02 '19 at 09:19
  • Hi thank you for your response...unfortunately that didn't work. I did not put the entire table in the post, but I added it so you can better understand what I am working with. – Mel Jul 02 '19 at 14:05
  • Did you ever come up with a solution? Dropping the All column is driving me crazy, and I also don't need an All row like you did. However, if I set margins=False, it for some reason screws up my dropna=True because my resulting table shows the NAN. I set margins=True and the NANs are gone but I get an All column I don't want. Very strange. – Brian Jan 08 '21 at 16:07

2 Answers2

1

You can swap the headers level by

artist_table=artist_table.swaplevel(0,1,axis=1).sort_index(axis=1)

and then drop the "All" column

artist_table.drop("All", axis=1)

then you can swap the levels back

artist_table=artist_table.swaplevel(0,1,axis=1).sort_index(axis=1)
0

I ended up converting Pivot Table to Dataframe using the alternative option suggested here:

transform pandas pivot table to regular dataframe

...and then dropped the 'All' column from that new dataframe.

Brian
  • 83
  • 1
  • 1
  • 11