Create a function iterating over pandas DataFrame rows to replace null values

Question

I have a DataFrame with some null values that I want to replace with mean values stored in another DataFrame. I've written a function that should later be applied with a lambda, but I keep getting an error.

I have a DataFrame like this:

CustomerType	Category	Satisfaction	Age
Not Premium	Electronics	Not Satisfied	NaN
Not Premium	Beauty	Satisfied	NaN
Premium	Sports	Satisfied	38.0
Not Premium	Sports	Not Satisfied	NaN

That I need to fill with this data:

CustomerType	Satisfaction	Age
Not Premium	Not Satisfied	32.440740
Not Premium	Satisfied	28.896348
Premium	Not Satisfied	43.767723
Premium	Satisfied	44.075901

So I've created a function:

def fill_age(x):
    if x.isnull()== True:
        return[(grp.CustomerType==x.CustomerType) | (grp.Satisfaction==x.Satisfaction)]['Age'].values[0]

That I would like to apply to my dataframe using a lambda function to iterate through all the rows:

df['Age'] = [df.apply(lambda x: fill_age(x) if np.isnan(x['Age']) else 
                                            x['Age'], axis=1) for x in df]

But I keep getting this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can anyone help me?

score 0 · Answer 1 · answered Mar 07 '22 at 22:35

Assuming the problem is the way apply is called on the DataFrame, and that fill_age() works correctly on the values of df["Age"], replace that statement with the one below. It applies the lambda to the Age column only, so x is a single value: the current age is kept when it is present, and fill_age(x) supplies the replacement from the external data when it is NaN. This should not raise the error:

df["Age"] = df["Age"].apply(lambda x: fill_age(x) if np.isnan(x) else x)
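
If the row-wise approach is kept, fill_age needs the whole row (to read CustomerType and Satisfaction), not just the Age value. Here is a minimal sketch using the sample data from the question. It assumes the table of means is named grp, as in the question's function, and that every CustomerType/Satisfaction pair in df also appears in grp; the rewritten fill_age is only an illustration, not the asker's original function.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "CustomerType": ["Not Premium", "Not Premium", "Premium", "Not Premium"],
    "Category": ["Electronics", "Beauty", "Sports", "Sports"],
    "Satisfaction": ["Not Satisfied", "Satisfied", "Satisfied", "Not Satisfied"],
    "Age": [np.nan, np.nan, 38.0, np.nan],
})
# Table of mean ages (named grp here, following the question's function).
grp = pd.DataFrame({
    "CustomerType": ["Not Premium", "Not Premium", "Premium", "Premium"],
    "Satisfaction": ["Not Satisfied", "Satisfied", "Not Satisfied", "Satisfied"],
    "Age": [32.440740, 28.896348, 43.767723, 44.075901],
})

def fill_age(row):
    """Return the row's Age, or the matching mean from grp when Age is missing."""
    if pd.isna(row["Age"]):  # scalar check, so no ambiguous truth value
        # Both conditions must hold, hence & rather than |.
        match = grp[(grp["CustomerType"] == row["CustomerType"])
                    & (grp["Satisfaction"] == row["Satisfaction"])]
        # Assumes a match exists; iloc[0] raises IndexError otherwise.
        return match["Age"].iloc[0]
    return row["Age"]

# axis=1 passes each row to fill_age as a Series.
df["Age"] = df.apply(fill_age, axis=1)
```

The original ValueError comes from x.isnull() being called on a whole row, which returns a Series rather than a single True/False; checking only row["Age"] avoids that.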

user11717481

Still not working, pretty sure the error is on the fill_age function. I get: "TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''. " – bancaletto Mar 08 '22 at 07:34

score 0 · Accepted Answer · answered Mar 07 '22 at 22:50

We should try to avoid apply where possible; we could use this instead:

df['Age'] = df['Age'].fillna(
    df.groupby(['CustomerType', 'Satisfaction'])['Age'].transform('first')
)
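
Note that transform('first') fills each gap with the first non-null Age found in the same group of df itself. If the values should instead come from the separate table of means shown in the question, one possible sketch is a left merge on the two key columns followed by fillna. The names grp, lookup and MeanAge are assumptions made for this example, and it assumes each CustomerType/Satisfaction pair appears only once in grp.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "CustomerType": ["Not Premium", "Not Premium", "Premium", "Not Premium"],
    "Category": ["Electronics", "Beauty", "Sports", "Sports"],
    "Satisfaction": ["Not Satisfied", "Satisfied", "Satisfied", "Not Satisfied"],
    "Age": [np.nan, np.nan, 38.0, np.nan],
})
# Table of mean ages (the name grp is taken from the question's function).
grp = pd.DataFrame({
    "CustomerType": ["Not Premium", "Not Premium", "Premium", "Premium"],
    "Satisfaction": ["Not Satisfied", "Satisfied", "Not Satisfied", "Satisfied"],
    "Age": [32.440740, 28.896348, 43.767723, 44.075901],
})

# Look up the mean age for every row of df. A left merge keeps one output
# row per df row, in df's order, as long as the keys in grp are unique.
lookup = df[["CustomerType", "Satisfaction"]].merge(
    grp.rename(columns={"Age": "MeanAge"}),  # MeanAge is a hypothetical name
    on=["CustomerType", "Satisfaction"],
    how="left",
)
# merge returns a fresh 0..n-1 index; restore df's index so fillna aligns.
lookup.index = df.index

# Only the missing ages are replaced; existing values are left untouched.
df["Age"] = df["Age"].fillna(lookup["MeanAge"])
```

With the sample data, the three missing ages are taken from grp and the existing 38.0 is kept.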

ansev
