1

I have my dtatframe and would like to present the data where the vlaues in the pivot table are simply the count of the strings, which are my columns in the pivot table:

Sample of my df:

trading_book    state
A               Traded Away
B               Dealer Reject
C               Dealer Reject
A               Dealer Reject
B               Dealer Reject
C               Dealer Reject
A               Dealer Reject
D               Dealer Reject
D               Dealer Reject
E               Dealer Reject

Desired Result:

    Traded Away Dealer Reject   Done
Book            
A          1           2          0
B          0           2          0
C          0           2          0
D          0           2          0
E          0           1          0

When I tried this with the following code:

Count_Row = df.shape[0] #gives number of row count
Count_Col = df.shape[1] #gives number of col count
df_Sample = df[['trading_book','state']].head(Count_Row-1)
display(df_Sample)

display(pd.pivot_table(
                   df_Sample, 
                   index=['trading_book'],
                   columns=['state'], 
                   values='state',
                   aggfunc='count'
              ))

I get only the trading books displaying

What needs to be done with the values and aggfunc paramaters?

cs95
  • 379,657
  • 97
  • 704
  • 746
Peter Lucas
  • 1,979
  • 1
  • 16
  • 27

3 Answers3

2

Correcting your pivot_table code:

v = df.pivot_table(
         index='trading_book', 
         columns='state', 
         aggfunc='size', 
         fill_value=0
)

There's no need to specify a values parameter as long as you specify an aggfunc='size' argument. Next, to get your exact output, you'll need to reindex your dataframe along the columns:

v.reindex(columns=np.append(df.state.unique(), 'Done'), fill_value=0)

state         Traded Away  Dealer Reject  Done
trading_book                                  
A                       1              2     0
B                       0              2     0
C                       0              2     0
D                       0              2     0
E                       0              1     0

Alternatively, specify the new columns you want in a list:

cols = ['Done', ...]
v.assign(**dict.fromkeys(cols, 0))

state         Dealer Reject  Traded Away  Done
trading_book                                  
A                         2            1     0
B                         2            0     0
C                         2            0     0
D                         2            0     0
E                         1            0     0
cs95
  • 379,657
  • 97
  • 704
  • 746
1

You can use crosstab:

pd.crosstab(df.trading_book,df.state).assign(Done=0)
Out[266]: 
state         Dealer Reject  Traded Away  Done
trading_book                                  
A                         2            1     0
B                         2            0     0
C                         2            0     0
D                         2            0     0
E                         1            0     0
Allen Qin
  • 19,507
  • 8
  • 51
  • 67
1

You can use at categorical type column in a crosstab. By using categories. You are telling Pandas that this should be considered an option even if it doesn't show up in this particular data set.

states = 'Traded Away;Dealer Reject;Done'.split(';')
pd.crosstab(df.trading_book, pd.Categorical(df.state, states))

col_0         Traded Away  Dealer Reject  Done
trading_book                                  
A                       1              2     0
B                       0              2     0
C                       0              2     0
D                       0              2     0
E                       0              1     0
piRSquared
  • 285,575
  • 57
  • 475
  • 624