Let's say I have data:

      a   b
0    1.0  NaN
1    6.0  1
2    3.0  NaN
3    1.0  NaN

I would like to go through this data and, wherever b is NaN **and** column a equals 1.0, replace that NaN with 4, rather than replacing every NaN with 4. How should I go about it? I tried various for/if combinations and none of them worked. I also tried

for i in df.itertuples():

but the problem is that df.itertuples() has no replace functionality, and the other methods I've seen replace values one at a time.

The end result I'm looking for:

      a   b
0    1.0  4
1    6.0  1
2    3.0  NaN
3    1.0  4
  • Hi Daniyal, please see this topic. I believe it may be helpful for you. https://stackoverflow.com/questions/14162723/replacing-pandas-or-numpy-nan-with-a-none-to-use-with-mysqldb – Felipe Cabral Oct 22 '20 at 03:04

4 Answers

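import numpy as np
import pandas as pd

def func(x):
    # x is a single row; fill b only when a == 1 and b is missing
    if x['a'] == 1 and pd.isna(x['b']):
        x['b'] = 4
    return x

df = pd.DataFrame.from_dict({'a': [1.0, 6.0, 3.0, 1.0], 'b': [np.nan, 1, np.nan, np.nan]})
# apply() returns a new DataFrame, so assign the result back
df = df.apply(func, axis=1)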

Instead of iterrows(), apply() may be a better option.

Chris Tang
  • Should work! Even so, I am not sure why df is defined later; I thought x should be df. –  Oct 22 '20 at 04:13
  • @Daniyaldehleh `x` is the series, or the row in this case, of the `df`. `apply()` works on each row/column of the `df`, so the function should handle `x` instead of `df`. – Chris Tang Oct 22 '20 at 04:34
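To illustrate the point made in the comment above, here is a minimal sketch of what `apply(..., axis=1)` hands to the function. The helper name `describe_row` and the list `seen_types` are made up for this illustration; the sample frame mirrors the question's data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1.0, 6.0, 3.0, 1.0], 'b': [np.nan, 1, np.nan, np.nan]})

seen_types = []  # hypothetical helper list, only used to record what apply passes in

def describe_row(x):
    # x is one row of df, delivered as a Series indexed by the column names
    seen_types.append(type(x).__name__)
    return bool(x['a'] == 1 and pd.isna(x['b']))

# axis=1 calls describe_row once per row and collects the results in a Series
needs_fill = df.apply(describe_row, axis=1)
print(needs_fill.tolist())
```

Because the function only ever sees one row at a time, it has to be written in terms of `x`, not `df`.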

Like you said, you can achieve this by combining two conditions: a == 1 and b is NaN.

To combine two element-wise conditions in pandas, you can use &, wrapping each condition in parentheses.

In your example:

import pandas as pd
import numpy as np

# Create sample data
d = {'a': [1, 6, 3, 1], 'b': [np.nan, 1, np.nan, np.nan]}
df = pd.DataFrame(data=d)

# Convert to numeric
df = df.apply(pd.to_numeric, errors='coerce')
print(df)

# Replace NaNs in column b only, in rows where a == 1
df.loc[(df['a'] == 1) & df['b'].isna(), 'b'] = 4
print(df) 

Should do the trick.
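A note on the assignment step, as a hedged sketch: assigning through a boolean mask alone writes to every column of the matching rows, whereas `.loc[rows, column]` restricts the write to one column. The names `both`, `whole_rows` and `only_b` are illustrative and not part of the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 6, 3, 1], 'b': [np.nan, 1, np.nan, np.nan]})

# Each condition is a boolean Series; & combines them element-wise.
# The parentheses matter because & binds more tightly than ==.
both = (df['a'] == 1) & df['b'].isna()

# Mask-only assignment overwrites the whole matching rows, including column a
whole_rows = df.copy()
whole_rows[both] = 4

# .loc[rows, column] writes only to column b in the matching rows
only_b = df.copy()
only_b.loc[both, 'b'] = 4

print(whole_rows)
print(only_b)
```

The second form is the one that matches the result asked for in the question, since column a is left untouched.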

drcrisp
  • AttributeError: module 'numpy' has no attribute 'isnull' –  Oct 22 '20 at 03:55
  • TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'' –  Oct 22 '20 at 03:55
  • Maybe your entries are strings rather than numbers. You can add: df = df.apply(pd.to_numeric, errors='coerce'). I will modify my answer accordingly. – drcrisp Oct 22 '20 at 05:54
  • Hi there, I added df = df.apply(pd.to_numeric, errors='coerce'). Nonetheless, the error persisted. –  Oct 23 '20 at 00:14
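One possible cause of the TypeError quoted in these comments, offered as an assumption because the asker's real data is not shown: `np.isnan` only accepts numeric dtypes, so it fails on an object-dtype column, while `pd.isna` / `Series.isna` works on any dtype. The sample frame below is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: values stored as strings/None, forced to object dtype
df = pd.DataFrame({'a': ['1', '6', '3', '1'], 'b': [None, '1', None, None]}, dtype=object)

try:
    np.isnan(df['b'])
    isnan_failed = False
except TypeError:
    # np.isnan has no implementation for object-dtype input
    isnan_failed = True

# Series.isna works regardless of dtype
missing = df['b'].isna()

# Convert to numbers first so that a == 1 compares numbers, not strings
df = df.apply(pd.to_numeric, errors='coerce')
df.loc[(df['a'] == 1) & df['b'].isna(), 'b'] = 4
print(df)
```

Checking `df.dtypes` before building the mask is a quick way to see whether a conversion like this is needed.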

You can create a mask and then fill in the intended NaNs using that mask:

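import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1,6,3,1], 'b': [np.nan, 1, np.nan, np.nan]})
# True for rows where a == 1 and b is missing (columns accessed by label)
mask = df[['a', 'b']].apply(lambda x: (x['a'] == 1) and pd.isna(x['b']), axis=1)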
df['b'] = df['b'].mask(mask, df['b'].fillna(4))
print(df)
   a    b
0  1  4.0
1  6  1.0
2  3  NaN
3  1  4.0
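The mask does not have to be built row by row. As a sketch of a vectorized alternative (not the answer author's code), the two conditions can be evaluated on whole columns, and `Series.mask` can take the scalar 4 directly:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 6, 3, 1], 'b': [np.nan, 1, np.nan, np.nan]})

# Column-wise conditions: True where a == 1 and b is missing
mask = df['a'].eq(1) & df['b'].isna()

# Series.mask replaces values where the condition is True and keeps the rest;
# Series.where is the mirror image (it keeps values where the condition is True)
df['b'] = df['b'].mask(mask, 4)
print(df)
```

On large frames this usually avoids the per-row Python function calls that `apply(..., axis=1)` makes.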
Benedictanjw
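# Fill NaNs only in the rows where a == 1
df2 = df[df['a'] == 1.0].fillna(4.0)
# Use the filled rows and fall back to the original df for all other rows
df = df2.combine_first(df)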

Can this help you?

horsefall