How to replace values in a row with first non zero value in pandas?

Question

I am attempting to replace every non-blank cell in a row (length > 0) with the first nonzero value in that row. If a cell is blank (length 0), it should be replaced with the float 0.0.

This is the expected input:

    VOL1    VOL2    D
    0       1       3
    21      21      
    19      0       0
    18      0

This is the expected output:

    VOL1    VOL2    D
    1       1       1
    21      21      0.0
    19      19      19  
    18      18      0.0

Thus far, this is what I have attempted:

import pandas as pd
import numpy as np

data = {
        'VOL1':[0, 21, 19, 18],
        'VOL2':[1, 21, 0, 0],
       }
 
# Create DataFrame
df = pd.DataFrame(data)
df['D'] = [3,"",0,""]

#get first nonzero
first_nonzero_df = df[df!=0].cumsum(axis=1).min(axis=1)
if df.isnull().any(axis=1):
  df.any(axis=1).replace(df, first_nonzero_df)

It's unclear to me what I'm doing wrong here, any help is appreciated. Thanks!
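
For reference, here is a minimal sketch of two reasons the attempt above cannot work as written, assuming the blanks in column D are empty strings as in the setup code. First, `isnull()` does not treat `""` as missing. Second, a boolean Series cannot be used directly as an `if` condition. The names `row_has_null` and `error_message` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"VOL1": [0, 21, 19, 18], "VOL2": [1, 21, 0, 0]})
df["D"] = [3, "", 0, ""]

# One boolean per row. Empty strings are not missing values,
# so no row is flagged even though column D has blanks.
row_has_null = df.isnull().any(axis=1)
print(row_has_null.tolist())

# Using a Series as an `if` condition raises ValueError,
# because pandas cannot reduce several booleans to one truth value.
error_message = None
try:
    if row_has_null:
        pass
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

Separately, `cumsum(axis=1).min(axis=1)` only equals the first non-zero value when all non-zero values in the row are positive, since only then is the running sum increasing.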

Column D contains cells that are supposed to get replaced with 0.0 – silvercoder Sep 28 '21 at 20:24
I suppose I could have set up a better example. There are other columns that have numbers as well as blanks – silvercoder Sep 28 '21 at 20:39
Blanks and ``None`` are different. I guess you were trying to have ``None``, right? – Karina Sep 28 '21 at 20:42
The data I'm sourcing actually has blanks, which is why my initial thought was to do a replace if the length is > 0. I've updated column D to represent what that column should look like. Apologies for the lack of clarity – silvercoder Sep 28 '21 at 20:53
Because the first non-zero value discovered was 1. In the same way, 19 gets applied to both the second and third columns in row 3 – silvercoder Sep 28 '21 at 20:58
But 3 is a non-zero value. Why should it be updated? And if it *is* updated, shouldn't column D in row 2 also be updated to 21? – not_speshal Sep 28 '21 at 21:00
Everything in a row gets updated with the first non-zero value unless it's a blank, in which case it gets updated to 0. Column D in row 2 is blank, which is why it gets updated to 0, not 21. Sorry if my setup for this wasn't clear – silvercoder Sep 28 '21 at 22:11
@silver - So you have the same value in every column in every row except for blanks? What about a row like `[1, 0, 2, ""]`? – not_speshal Sep 28 '21 at 23:30
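
Read literally, the rule described in these comments (every non-blank cell takes the row's first non-zero value, and blanks become 0.0) could be sketched as below. This is an illustration, not one of the posted answers. It assumes blanks are empty strings and are the only non-numeric entries, and that a row with no non-zero value should fall back to 0.0. The names `num`, `first` and `result` are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "VOL1": [0, 21, 19, 18],
    "VOL2": [1, 21, 0, 0],
    "D": [3, "", 0, ""],
})

# Coerce every column to numbers; blanks (empty strings) become NaN.
num = df.apply(pd.to_numeric, errors="coerce")

# Hide zeros as NaN too, then back-fill along each row so that the
# first column holds the first non-zero value of that row.
first = num.where(num != 0).bfill(axis=1).iloc[:, 0]
first = first.fillna(0.0)  # assumption: rows without any non-zero value get 0.0

# Blank cells become 0.0; every other cell takes the row's first non-zero value.
result = pd.DataFrame(
    np.where(num.isna().to_numpy(), 0.0, first.to_numpy()[:, None]),
    index=df.index,
    columns=df.columns,
)
print(result)
```

With the question's data this reproduces the expected output table, with every value stored as a float. For the row `[1, 0, 2, ""]` asked about above, it would give `[1, 1, 1, 0]`.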

not_speshal · Answer 1 · 2021-09-28T20:59:48.057

1

IIUC, try:

>>> df.where(df!=0, df[df!=0].ffill(axis=1).bfill(axis=1)).replace("",0)
   VOL1  VOL2     D
0     1     1   3.0
1    21    21   0.0
2    19    19  19.0
3    18    18   0.0
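
To see what the one-liner does, it can be split into its steps. This is a sketch using the question's data. The intermediate names `nonzero` and `filled` are assumptions for illustration, and the exact dtypes of the result may vary between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"VOL1": [0, 21, 19, 18], "VOL2": [1, 21, 0, 0]})
df["D"] = [3, "", 0, ""]

# Step 1: mask the zeros; cells equal to 0 become NaN, everything else is kept.
nonzero = df[df != 0]

# Step 2: fill each NaN from its left neighbour, then from its right neighbour,
# so every former zero picks up an adjacent non-zero value in the same row.
filled = nonzero.ffill(axis=1).bfill(axis=1)

# Step 3: keep the original cell where it was non-zero, otherwise use the fill,
# and finally turn the blank strings into 0.
result = df.where(df != 0, filled).replace("", 0)
print(result)
```

Only zeros are replaced here, so the 3 in column D of the first row is kept. That differs from the expected output in the question, where that cell becomes 1.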

edited Sep 28 '21 at 20:59

answered Sep 28 '21 at 20:28

not_speshal

score 0 · Answer 2 · answered Sep 28 '21 at 20:28

import pandas as pd
data = {
        'VOL1':[0, 21, 19, 18],
        'VOL2':[1, 21, 0, 0],
       }
 
# Create DataFrame
df = pd.DataFrame(data)
df['D'] = [None] * len(df)

first_nonzero_df = df[df!=0].cumsum(axis=1).min(axis=1)

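# Replace each zero cell with the per-row value computed above.
# .loc avoids chained assignment (df[col][i] = ...), which may not write back to df.
for i in df.index:
    for col in df.columns:
        if df.loc[i, col] == 0:
            df.loc[i, col] = first_nonzero_df[i]
# The remaining missing values (the None cells in D) become 0.
df = df.fillna(0)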
df

answered Sep 28 '21 at 20:28

Karina

It is generally not a good idea to iterate over DataFrames, especially when there are vectorized solutions available. See [here](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas/55557758#55557758). – not_speshal Sep 28 '21 at 20:30
I didn't know that. Thanks! and your one liner seems very concise and elegant! – Karina Sep 28 '21 at 20:33
I always learn a lot from comments on my code too! Thank you for taking it well - I definitely wasn't trying to criticize. – not_speshal Sep 28 '21 at 20:34
