1

I have a (101×1766) dataframe and I put a sample down.

Index  Id  Brand1   Brand2   Brand3
0      1   NaN      Good     Bad
1      2   Bad      NaN      NaN
2      3   NaN      NaN      VeryBad
3      4   Good     NaN      NaN
4      5   NaN      Good     VeryGood
5      6   VeryBad  Good     NaN

What I want to achieve is a table like that

Index  VeryBad   Bad    Good   VeryGood
Brand1  1        1      0      0
Brand2  0        0      3      0
Brand3  1        1      0      1

I could not find a solution even a wrong one at all.

So, hope to see your help

aearslan
  • 156
  • 8
  • Also: [pandas count values in each column of a dataframe](https://stackoverflow.com/q/22888434/7851470) – Georgy Dec 17 '19 at 10:05

4 Answers4

5

Let us do two steps : melt + crosstab

s=df.melt(['Id','Index'])
yourdf=pd.crosstab(s.variable,s.value)
yourdf
value     Bad  Good  VeryBad  VeryGood
variable                              
Brand1      1     1        1         0
Brand2      0     3        0         0
Brand3      1     0        1         1
BENY
  • 317,841
  • 20
  • 164
  • 234
3

Select all columns without first by DataFrame.iloc, then count values by value_counts, replace non matched missing values, convert to integers, transpose and last for change order of columns use reindex:

cols = ['VeryBad','Bad','Good','VeryGood']
df = df.iloc[:, 1:].apply(pd.value_counts).fillna(0).astype(int).T.reindex(cols, axis=1)
print (df)
        VeryBad  Bad  Good  VeryGood
Brand1        1    1     1         0
Brand2        0    0     3         0
Brand3        1    1     0         1
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
3

Here is an approach using melt and pivot_table:

(df.melt(id_vars='Id')
 .pivot_table(index='variable',
              columns='value',
              aggfunc='count',
              fill_value=0))

[out]

          Id                      
value    Bad Good VeryBad VeryGood
variable                          
Brand1     1    1       1        0
Brand2     0    3       0        0
Brand3     1    0       1        1
Chris Adams
  • 18,389
  • 4
  • 22
  • 39
1

Another way is : get_dummies with transpose + groupby()+sum()

m=pd.get_dummies(df.set_index('Id').T)
final=m.groupby(m.columns.str.split('_').str[1],axis=1).sum()

        Bad  Good  VeryBad  VeryGood
Brand1    1     1        1         0
Brand2    0     3        0         0
Brand3    1     0        1         1
anky
  • 74,114
  • 11
  • 41
  • 70