The columns of my DataFrame are categorized into A and B.

df :
Category    A      A      A       B       B 
CODE       U-01   U-02   U-03    U-04    U-05
n1          0      1      0       2       nan
n2          1      1      0       nan     nan
n3          3      0     nan       0       2

For each row, I want to count the values per category against three standards: equal to 0 ("0"), greater than 0 (">0"), and NaN.

My desired output table would look like:

Category                 A                   B
Standard           0    >0    nan      0    >0     nan 
 n1                2     1     0       0     1      1
 n2                1     2     0       0     0      2
 n3                1     1     1       1     1      0
For example, for n1 the count for Standard "0" under Category A is 2, because A/U-01 and A/U-03 are both 0.

Please help me.

democracii

1 Answer

Use DataFrame.unstack to reshape the DataFrame into a Series with a MultiIndex, then replace values greater than 0 with the string '>0', replace missing values with the string 'nan', count the values per group with SeriesGroupBy.value_counts, and reshape the result with Series.unstack:

df1 = (df.unstack()
         .mask(lambda x: x.gt(0), '>0')
         .fillna('nan')
         .groupby(level=[0, 2])
         .value_counts()
         .unstack([0,2], fill_value=0)
         .rename(columns={0:'0'}))
print (df1)
Category  A         B       
          0 >0 nan >0 nan  0
n1        2  1   0  1   1  0
n2        1  2   0  0   2  0
n3        1  1   1  1   0  1
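For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the question's frame under the assumption that the columns are a (Category, CODE) MultiIndex, and it counts each standard with plain boolean tests instead of the unstack/value_counts chain above, so it is an alternative illustration rather than the same method. It assumes the data contains no negative values, as in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame; a (Category, CODE) MultiIndex is assumed.
columns = pd.MultiIndex.from_arrays(
    [["A", "A", "A", "B", "B"],
     ["U-01", "U-02", "U-03", "U-04", "U-05"]],
    names=["Category", "CODE"],
)
df = pd.DataFrame(
    [[0, 1, 0, 2, np.nan],
     [1, 1, 0, np.nan, np.nan],
     [3, 0, np.nan, 0, 2]],
    index=["n1", "n2", "n3"],
    columns=columns,
)

counts = {}
for category in df.columns.get_level_values("Category").unique():
    block = df[category]  # all CODE columns that belong to this category
    # Row-wise counts for each standard; negative values would not be
    # counted anywhere (assumed absent, as in the question's data).
    counts[(category, "0")] = block.eq(0).sum(axis=1)
    counts[(category, ">0")] = block.gt(0).sum(axis=1)
    counts[(category, "nan")] = block.isna().sum(axis=1)

# Tuple keys become a two-level column index.
result = pd.DataFrame(counts)
result.columns.names = ["Category", "Standard"]
print(result)
```

Because every (Category, Standard) pair is created explicitly, a pair with no matching values still appears as a column of zeros.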
jezrael
  • Do you know how to solve this error? `'>' not supported between instances of 'str' and 'int'` – democracii Aug 25 '20 at 06:56
  • @democracii - Yes, the problem is that some values in some columns are not numeric. You need `df = df.apply(lambda x: pd.to_numeric(x, errors='coerce'))` to convert these values to `NaN`s; more info [here](https://stackoverflow.com/questions/15891038/change-data-type-of-columns-in-pandas) – jezrael Aug 25 '20 at 06:59
  • In this result, a column that has no data is not shown. – democracii Aug 25 '20 at 08:07
  • @democracii - So are there perhaps spaces before the numbers? Does `df = df.apply(lambda x: pd.to_numeric(x.str.strip(), errors='coerce'))` work? – jezrael Aug 25 '20 at 08:08
  • Yes, it works! So that problem is solved, but there is another one. In my actual data, if none of the values for n1, n2, n3 fall under A and >0, the column A & >0 does not appear in the result. – democracii Aug 25 '20 at 08:11
  • @democracii - Can you create a new question? – jezrael Aug 25 '20 at 08:13
  • Yes! Thank you, I created a new question. – democracii Aug 25 '20 at 08:44
  • @democracii - I understand what is needed and added an answer. – jezrael Aug 25 '20 at 08:53
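
The two issues from the comments can be sketched together. The frame below is hypothetical (numbers stored as text, some with leading spaces, one non-numeric entry), and forcing the missing columns with `reindex` is only one possible approach, not necessarily what was posted on the follow-up question. `astype(str)` is used before `.str.strip()` so that the accessor also works on columns that are already numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: text values, stray spaces, one non-numeric cell.
columns = pd.MultiIndex.from_arrays(
    [["A", "A", "B"], ["U-01", "U-02", "U-03"]],
    names=["Category", "CODE"],
)
raw = pd.DataFrame(
    [[" 0", "0", "2"],
     ["0", " 0", "x"]],
    index=["n1", "n2"],
    columns=columns,
)

# Strip whitespace, then turn anything non-numeric into NaN.
clean = raw.apply(
    lambda col: pd.to_numeric(col.astype(str).str.strip(), errors="coerce")
)

# Long format: one row per cell, with its row name, Category and CODE.
long = clean.melt(ignore_index=False).rename_axis("row").reset_index()
long["Standard"] = np.select(
    [long["value"].isna(), long["value"] > 0], ["nan", ">0"], default="0"
)

# Only combinations that actually occur become columns here.
counts = (
    long.groupby(["row", "Category", "Standard"])
        .size()
        .unstack(["Category", "Standard"], fill_value=0)
)

# Force every (Category, Standard) pair to exist, filling absent ones with 0.
full = pd.MultiIndex.from_product(
    [["A", "B"], ["0", ">0", "nan"]], names=["Category", "Standard"]
)
complete = counts.reindex(columns=full, fill_value=0)
print(complete)
```

In this sketch `counts` has no A & >0 column because no such value occurs, while `complete` shows it filled with zeros.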