How to create a pivot table using Python pandas with column entries pivoted to column headings and a new column for blank entries?

Question

I have a table in a DataFrame, read from Excel:

colA       colB  colC  colD
123451      a     w     p
123452      b     x     q
123453      c     y     r
123454      a     x     
123454      a     w     p

And I want something like this using pandas.pivot_table:

colC   p  q  r  "unassigned" "total"
 w     2  0  0      0           2
 x     0  1  0      1           2
 y     0  0  1      0           1

@jezrael, have made a slight change to the question, can you please help? — sudoCoder, Jan 25 '19 at 08:22

score 2 · Accepted Answer · answered Jan 25 '19 at 09:12

You can use crosstab to count the non-missing colD values per colC. Then flag the missing values with isna and aggregate per colC group with agg, using sum for the unassigned count and size for the total. Finally, join both results together with DataFrame.join:

df1 = pd.crosstab(df.colC, df.colD)
print (df1)
colD  p  q  r
colC         
w     2  0  0
x     0  1  0
y     0  0  1

df2 = (df['colD'].isna()
                 .astype(int)
                 .groupby(df['colC'])
                 .agg([('unassigned','sum'),('total','size')]))
print (df2)
      unassigned  total
colC                   
w              0      2
x              1      2
y              0      1

df = df1.join(df2).reset_index()
print (df)
  colC  p  q  r  unassigned  total
0    w  2  0  0           0      2
1    x  0  1  0           1      2
2    y  0  0  1           0      1
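
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The DataFrame is rebuilt by hand from the question's table (assuming the blank colD cell is read in as NaN), and the names counts, missing, extra and result are only illustrative. It calls sum and size separately instead of going through agg, which should give the same two columns:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; the blank colD cell is assumed to be NaN.
df = pd.DataFrame({
    "colA": [123451, 123452, 123453, 123454, 123454],
    "colB": ["a", "b", "c", "a", "a"],
    "colC": ["w", "x", "y", "x", "w"],
    "colD": ["p", "q", "r", np.nan, "p"],
})

# Count each non-missing colD value per colC (rows with NaN are skipped by crosstab).
counts = pd.crosstab(df["colC"], df["colD"])

# Per colC group: how many colD entries are missing, and how many rows there are.
missing = df["colD"].isna().astype(int).groupby(df["colC"])
extra = pd.DataFrame({"unassigned": missing.sum(), "total": missing.size()})

# Both tables are indexed by colC, so an index join lines them up.
result = counts.join(extra).reset_index()
print(result)
```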

Thank you @jezrael for the help! This works perfectly. – sudoCoder Jan 25 '19 at 10:17

score 1 · Answer 2 · Keval Dave · Jan 25 '19 at 10:24

You can replace all missing values (None/NaN) with 'unassigned', then use crosstab to get the respective counts, and use sum along axis=1 for the total count.

The following code does this:

df1 = df[['colC', 'colD']].fillna('unassigned')
df1 = pd.crosstab(df1.colC, df1.colD)
df1['total'] = df1.sum(axis=1)

The output of this code is:

colD  p  q  r  unassigned  total
colC
w     2  0  0           0      2
x     0  1  0           1      2
y     0  0  1           0      1
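
A self-contained sketch of this approach, for reference. The sample frame is rebuilt by hand from the question (assuming the blank cell is NaN), and pairs, table and with_margins are illustrative names. The second half is an optional variation, not part of the answer above: crosstab can compute the totals itself via margins=True, which also adds a totals row that is dropped here.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; the blank colD cell is assumed to be NaN.
df = pd.DataFrame({
    "colA": [123451, 123452, 123453, 123454, 123454],
    "colB": ["a", "b", "c", "a", "a"],
    "colC": ["w", "x", "y", "x", "w"],
    "colD": ["p", "q", "r", np.nan, "p"],
})

# Work on a two-column copy and label the missing colD values.
pairs = df[["colC", "colD"]].fillna({"colD": "unassigned"})

# Count every colC/colD combination, then add the row totals.
table = pd.crosstab(pairs["colC"], pairs["colD"])
table["total"] = table.sum(axis=1)
print(table)

# Variation: let crosstab add the totals, then drop the extra totals row.
with_margins = pd.crosstab(
    pairs["colC"], pairs["colD"], margins=True, margins_name="total"
).drop(index="total")
print(with_margins)
```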

edited Jan 25 '19 at 10:24

answered Jan 25 '19 at 09:42

Keval Dave


This would fail where other columns also have blank entries, it would change them too! – sudoCoder Jan 25 '19 at 10:18
@sudoCoder we can slice data frame to use colC and colD only. – Keval Dave Jan 25 '19 at 10:26
That may help, I see! – sudoCoder Jan 25 '19 at 10:34
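
A small sketch of the point made in these comments, using a made-up three-row frame (hypothetical data, not the question's table) in which colB also has a blank entry. Selecting only colC and colD before fillna should leave the blanks in the original frame untouched:

```python
import pandas as pd

# Hypothetical data: both colB and colD contain a blank (None) entry.
df = pd.DataFrame({
    "colB": ["a", None, "c"],
    "colC": ["w", "x", "x"],
    "colD": ["p", None, "q"],
})

# Select only the two relevant columns; fillna returns a new frame.
pairs = df[["colC", "colD"]].fillna("unassigned")
table = pd.crosstab(pairs["colC"], pairs["colD"])
print(table)

# The original frame still has its blanks, including the one in colB.
print(df.isna().sum())
```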
