

How do I divide each number by the sum (skipping zeros)? I'd like to divide each row by its sum.

For example: 0 (the value in column 0) / 2 (the value in the sum column)

0 1 2 3 4 5 6 7 ... sum
0 0 0 0 1 0 0 0 ...  2

result

0 1 2 3 4   5 6 7 ... sum
0 0 0 0 0.5 0 0 0      2
dr.dt
  • Can you please post the desired output so we know what you are talking about? – Joe Ferndz Nov 23 '20 at 09:43
  • I've posted the desired output. – dr.dt Nov 23 '20 at 09:56
  • Can you please explain more clearly what task you are trying to accomplish? "each number by sum of each row" is quite vague. Also, can you add the code you have been working on so far? Finally, can you post the dataframe in a form that others can use to work on a solution? A screenshot of a dataframe is not workable. Please check https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – alec_djinn Nov 23 '20 at 10:08

2 Answers


You can try something like this:

# Select every column except 'sum'. Casting the labels to str keeps
# .str.contains working when some column labels are integers.
required_columns = df.columns[~df.columns.astype(str).str.contains('sum')]
# contains() also accepts a regex. This assumes that none of the columns
# to be divided has 'sum' in its name.

for col in required_columns:
    print(f'---------- {col} --------')
    # Divide the column element-wise by the 'sum' column
    df[col] = df[col] / df['sum']
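The loop above can also be written without iterating over columns. This is only a sketch, assuming the frame looks like the one in the question (integer column labels plus a 'sum' column); the names value_cols and result are made up for illustration.

```python
import pandas as pd

# Hypothetical frame modelled on the question's example row.
df = pd.DataFrame([[0, 0, 0, 0, 1, 0, 0, 0, 2]],
                  columns=[0, 1, 2, 3, 4, 5, 6, 7, 'sum'])

# Every column label except 'sum'.
value_cols = df.columns.drop('sum')

# Divide each row of the value columns by that row's 'sum'.
# axis=0 aligns the divisor with the row index.
result = df.copy()
result[value_cols] = df[value_cols].div(df['sum'], axis=0)
print(result)
```

Note that a row whose sum is 0 would produce NaN or inf here, so it needs separate handling.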

Vaebhav

You can also use this to get the same result:

df.iloc[:,:-1] = df.apply(lambda r: r/r['sum'] if r['sum'] != 0 else r['sum'],axis=1).round(2)

Given this source df:

   0  1  2  3  4  5  6  7  sum
0  0  0  0  0  1  0  0  0    2
1  0  0  0  6  0  0  0  0   18
2  0  0  0  0  1  0  0  0    0
3  0  0  3  0  0  0  4  0    1

This will result in:

     0    1    2     3    4    5    6    7  sum
0  0.0  0.0  0.0  0.00  0.5  0.0  0.0  0.0    2
1  0.0  0.0  0.0  0.33  0.0  0.0  0.0  0.0   18
2  0.0  0.0  0.0  0.00  0.0  0.0  0.0  0.0    0
3  0.0  0.0  3.0  0.00  0.0  0.0  4.0  0.0    1

Here is the explanation for the above code:

On the left-hand side of the assignment, I have iloc. You can find more details in the pandas documentation for iloc.

df.iloc[:,:-1]

Here I am picking all the rows (the first :). The second part (:-1) selects the columns: every column except the last one, which is the sum column. The value computed on the right-hand side is assigned to those columns only, because I don't want to replace the sum values.
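To see what that iloc selection picks, here is a small sketch on a made-up frame (the column names a and b and the variables left and last are hypothetical):

```python
import pandas as pd

# Hypothetical data: two value columns followed by a 'sum' column.
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'sum': [4, 6]})

left = df.iloc[:, :-1]   # all rows, every column except the last
last = df.iloc[:, -1]    # all rows, only the last column

print(list(left.columns))
print(last.name)
```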

df.apply will process the dataframe one row at a time. See the pandas documentation for examples of df.apply.

apply passes each row (r) to the lambda in turn. You wanted to compute column (x) / column('sum'), and that is what r/r['sum'] does: it divides every value in the row by that row's sum.

I also check that r['sum'] is not zero, to avoid dividing by zero. If r['sum'] is zero, I return r['sum'] (that is, zero) instead.
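The same zero-sum guard can be sketched without apply. This is an alternative under my own assumptions, not the code from the answer: zero sums are swapped for NaN before dividing, and the resulting NaN values are then filled with 0. The names value_cols, safe_sum and ratios are made up.

```python
import numpy as np
import pandas as pd

# Source frame from the answer.
df = pd.DataFrame(
    [[0, 0, 0, 0, 1, 0, 0, 0, 2],
     [0, 0, 0, 6, 0, 0, 0, 0, 18],
     [0, 0, 0, 0, 1, 0, 0, 0, 0],
     [0, 0, 3, 0, 0, 0, 4, 0, 1]],
    columns=[0, 1, 2, 3, 4, 5, 6, 7, 'sum'])

value_cols = df.columns.drop('sum')

# Zero sums become NaN, so those rows divide to NaN instead of inf.
safe_sum = df['sum'].replace(0, np.nan)

# Divide row-wise, turn the NaN rows into zeros, round to 2 decimals.
ratios = df[value_cols].div(safe_sum, axis=0).fillna(0).round(2)
print(ratios)
```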

A DataFrame object has two axes: “axis 0” and “axis 1”. “axis 0” represents rows and “axis 1” represents columns. I am using axis=1 so that apply processes one row at a time instead of one column at a time.
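A quick way to check what axis does in apply is to try both values on a tiny frame. The data and variable names below are made up for illustration:

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# axis=1: the function receives each row as a Series.
row_totals = df.apply(lambda r: r.sum(), axis=1)

# axis=0 (the default): the function receives each column as a Series.
col_totals = df.apply(lambda c: c.sum(), axis=0)

print(row_totals.tolist())
print(col_totals.tolist())
```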

Hope this explanation helps.

Joe Ferndz
  • Hi! could you explain the process behind it if it's not too troublesome? I'm a newbie and trying to learn. I would very much appreciate it. – dr.dt Nov 24 '20 at 02:53
  • Sorry about not adding details to the answer. Give me a few mins and I will explain the code for you in detail. – Joe Ferndz Nov 24 '20 at 03:19
  • Not at all. You're the one helping me out here. Thank you very much. – dr.dt Nov 24 '20 at 03:47
  • I have added a write-up. See if it helps. You may need to practice a few examples to get the hang of this. Don't forget to upvote. – Joe Ferndz Nov 24 '20 at 04:59