
I have the following DataFrame:

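import numpy as np
import pandas as pd

Array = np.array([[87, 70, 95],
   [52, 47, 44],
   [44, 97, 94],
   [79, 36,  2]])

# a flat list of labels gives a plain Index (a nested list would build a MultiIndex)
df_test = pd.DataFrame(Array, columns=['Apple', 'Banana', 'Tomato'], index=['Joe', 'Steve', 'Wes', 'Jim'])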

Which looks like:

       Apple  Banana  Tomato
Joe       87      70      95
Steve     52      47      44
Wes       44      97      94
Jim       79      36       2

I want to compute the share of each expense within its row, but I cannot find how to do it. I imagine it would look something like:

df_test.apply(lambda: x/max(line),axis=2)

and the result would be:

       Apple  Banana  Tomato
Joe     0.35    0.28    0.38
.        .      .      .

But I cannot find a way to compute the max of each row inside the lambda function. Does anyone have an idea? Thanks in advance!

EdChum
Jb_Eyd

1 Answer


You want to divide each row by its row-wise sum:

In [111]:
df_test.div(df_test.sum(axis=1), axis=0)

Out[111]:
          Apple    Banana    Tomato
Joe    0.345238  0.277778  0.376984
Steve  0.363636  0.328671  0.307692
Wes    0.187234  0.412766  0.400000
Jim    0.675214  0.307692  0.017094
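
For comparison with the lambda attempt in the question, here is a minimal self-contained sketch showing that `apply` with `axis=1` gives the same shares as `div`. It assumes a flat index of names, and the variable names `shares_div` and `shares_apply` are only illustrative:

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame(
    np.array([[87, 70, 95], [52, 47, 44], [44, 97, 94], [79, 36, 2]]),
    columns=['Apple', 'Banana', 'Tomato'],
    index=['Joe', 'Steve', 'Wes', 'Jim'],  # assumed flat index
)

# Vectorised: divide every row by that row's total
shares_div = df_test.div(df_test.sum(axis=1), axis=0)

# Same idea with apply: axis=1 passes each row to the lambda as a Series
shares_apply = df_test.apply(lambda row: row / row.sum(), axis=1)

print(shares_div.round(2))
```

Both give the same numbers; `div` is usually the better choice because it avoids calling a Python function once per row.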

If you want to set the precision you can call round:

In [112]:
df_test.div(df_test.sum(axis=1), axis=0).round(2)

Out[112]:
       Apple  Banana  Tomato
Joe     0.35    0.28    0.38
Steve   0.36    0.33    0.31
Wes     0.19    0.41    0.40
Jim     0.68    0.31    0.02
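
If you would rather show percentages than fractions, one option is to scale before rounding. This is only a sketch, again assuming a flat index; the names `shares` and `percent` are mine:

```python
import pandas as pd

df_test = pd.DataFrame(
    [[87, 70, 95], [52, 47, 44], [44, 97, 94], [79, 36, 2]],
    columns=['Apple', 'Banana', 'Tomato'],
    index=['Joe', 'Steve', 'Wes', 'Jim'],  # assumed flat index
)

# Fraction of each row's total
shares = df_test.div(df_test.sum(axis=1), axis=0)

# Scale to percent, then keep one decimal place
percent = (shares * 100).round(1)

print(percent)
```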
EdChum
  • Could you explain what the div function does exactly? It's not just dividing, if I understand correctly; it's mapping + dividing, right? – Jb_Eyd Apr 06 '16 at 20:02
  • It divides the df by the result of sum, and it also aligns the index and columns while doing so. You can see my other answer for how this works: http://stackoverflow.com/questions/29954263/what-does-the-term-broadcasting-mean-in-pandas-documentation/29955358#29955358 – EdChum Apr 06 '16 at 20:17