I am not abe to make accurate pivot table

Question

I have one dataframe which contain many columns and i am trying to make pivot table like this

Data sample

program | InWappTable | InLeadExportTrack

VIC    |  True  | 1

VIC    | True   |1

VIC | True  |1

VIC | True  | 1

Here is my code

rec.groupby(['InWappTable', 'InLeadExportTrack','program']).size()

And Expected Output is

it is advisable to paste the data as text, not images as we would not be able to copy the image. also your expected output has some rows which your input doesnot — anky, Feb 03 '19 at 17:47
Actually i have big data set which contain many columns and thousands of rows thats why i m not able to share whole rows — Malik Asad, Feb 03 '19 at 17:56
how `df.groupby(['program'])['InWappTable','InLeadExporttrack'].count().reset_index()` work? — anky, Feb 03 '19 at 18:03
df.groupby(['program'])['InWappTable','InLeadExporttrack'].count().reset_index() missing the total value counts — Malik Asad, Feb 03 '19 at 18:08
how about `df.pivot_table(index='program', values=['InWappTable', 'InLeadExporttrack'], aggfunc='count', margins=True, margins_name='Total')`? — Jacopo Repossi, Feb 03 '19 at 18:26
@MalikAsad incorrect compared to what expected result? I put 'count' as argument of `aggfunc` but could have been 'sum' or unique count.. — Jacopo Repossi, Feb 03 '19 at 18:36
@anky_91 thanks for your time and your solution is correct but missing total that i calculated with this t_wapp=df['InWappTable'].sum() — Malik Asad, Feb 03 '19 at 18:36
@JacopoRepossi I don't know why but aggfunc is not giving correct total rest of the value are same as groupby function — Malik Asad, Feb 03 '19 at 18:40

edesz · Answer 1 · 2019-02-03T18:56:08.137

I do not believe that you require a pivot_table here, though a pivot_table approach with aggfunc can also be used effectively.

Here is how I approached this

Generate some data

a = [['program','InWappTable','InLeadExportTrack'],
     ['VIC',True,1],
     ['Mall',False,15],
     ['VIC',True,101],
     ['VIC',True,1],
     ['Mall',True,74],
     ['Mall',True,11],
     ['VIC',False,44]]
df = pd.DataFrame(a[1:], columns=a[0])
print(df)

  program  InWappTable  InLeadExportTrack
0     VIC         True                  1
1    Mall        False                 15
2     VIC         True                101
3     VIC         True                  1
4    Mall         True                 74
5    Mall         True                 11
6     VIC        False                 44

First do GROUP BY with count aggregation

df_grouped = df.groupby(['program']).count()
print(df_grouped)

         InWappTable  InLeadExportTrack
program                                
Mall               3                  3
VIC                4                  4

Then to get the sum of all columns

num_cols = ['InWappTable','InLeadExportTrack']
df_grouped[num_cols] = df_grouped[num_cols].astype(int)
df_grouped.loc['Total']= df_grouped.sum(axis=0)
df_grouped.reset_index(drop=False, inplace=True)
print(df_grouped)

  program  InWappTable  InLeadExportTrack
0    Mall            3                  3
1     VIC            4                  4
2   Total            7                  7

EDIT

Based on the comments in the OP, df_grouped = df.groupby(['program']).count() could be replaced by df_grouped = df.groupby(['program']).sum(). In this case, the output is shown below

  program  InWappTable  InLeadExportTrack
0    Mall            2                100
1     VIC            3                147
2   Total            5                247

why not `df.pivot_table(index='program', values=['InWappTable', 'InLeadExporttrack'], aggfunc='count', margins=True, margins_name='Total')`, which is a one line command? Am I missing something? — Jacopo Repossi, Feb 03 '19 at 18:33
There is nothing wrong with your approach! I'll fix my wording. For me, just looking at the raw data from the OP, it seemed like `groupby` was more intuitive. Using the sample data I posted, I get the same answer using your approach as what I posted. :) — edesz, Feb 03 '19 at 18:44
oh great! I thought I was missing something, also because in the comments above he says that `aggfunc` is not giving him the right total... — Jacopo Repossi, Feb 03 '19 at 18:49

score 1 · Accepted Answer · answered Feb 03 '19 at 18:48

1

IIUC, you can try this:

df_new=df.groupby(['program'])['InWappTable','InLeadExporttrack'].count().reset_index()
total = df_new.sum()
total['program'] = 'Total'
df_new=df_new.append(total, ignore_index=True)
print(df_new)

answered Feb 03 '19 at 18:48

anky

74,114
11
41
70

I am not abe to make accurate pivot table

2 Answers2