1

I have a table in DataFrame taken from excel:

col A      ColB  colC  colD   
123451      a     w     p
123452      b     x     q
123453      c     y     r
123454      a     x     
123454      a     w     p 

And I want something like this using pandas.pivot_table:

colC   p  q  r  "unassigned" "total"
 w     2  0  0      0           2
 x     0  1  0      1           2
 y     0  0  1      0           1
sudoCoder
  • 115
  • 2
  • 13

2 Answers2

2

You can use crosstab for first columns and then check missing values with isna and aggregate by agg for count by sum and total by size, last join togehter by DataFrame.join:

df1 = pd.crosstab(df.colC, df.colD)
print (df1)
colD  p  q  r
colC         
w     2  0  0
x     0  1  0
y     0  0  1

df2 = (df['colD'].isna()
                 .astype(int)
                 .groupby(df['colC'])
                 .agg([('unassigned','sum'),('total','size')]))
print (df2)
      unassigned  total
colC                   
w              0      2
x              1      2
y              0      1

df = df1.join(df2).reset_index()
print (df)
  colC  p  q  r  unassigned  total
0    w  2  0  0           0      2
1    x  0  1  0           1      2
2    y  0  0  1           0      1
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
1

You can replace all the None with 'unassigned'. Then use crosstab to get respective counts. Use sum with proper axis for total count.

Following is the code for doing this

df1 = df[['colC', 'colD']].fillna('unassigned')
df1 = pd.crosstab(df1.colD, df1.colD)
df1['total'] = df1.sum(axis=1)

Following is the output for the code

D   p   q   r   unassigned  total
C                   
w   2   0   0    0           2
x   0   1   0    1           2
y   0   0   1    0           1
Keval Dave
  • 2,777
  • 1
  • 13
  • 16