fillna and replace based on column condition python pandas

Question

i have a df with MachineType, Prod/RT, and several other columns. MachineType contains either TRUE or FALSE. need to .fillna and .replace but in different ways for MachineType. (filling values are different for TRUE and FALSE)

Dataframe : updatedDf

my code do above calc:

updatedDf['Prod/RT']=updatedDf[updatedDf['MachineType']==True]['Prod/RT'].replace(np.inf,0.021660)
updatedDf['Prod/RT']=updatedDf[updatedDf['MachineType']==True]['Prod/RT'].fillna(0.021660)


updatedDf['Prod/RT']=updatedDf[updatedDf['MachineType']==False]['Prod/RT'].replace(np.inf,0.050261)
updatedDf['Prod/RT']=updatedDf[updatedDf['MachineType']==False]['Prod/RT'].fillna(0.050261)

But my code gives an unexpected output with Nan values. Is there any way to fix this error?or can't we .fillna and .replace like above way?

The question, as it is now, is missing key information, e.g. sample data, sample output. It is likely to be down-voted and got closed.... — Quang Hoang, May 03 '21 at 16:15
Also see [reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). — BigBen, May 03 '21 at 16:26
it appers you are mostly there... I'd recommend using `df.loc[mask,col] = df.loc[mask,col].fillna(...)` — Rob Raymond, May 03 '21 at 16:43

score 1 · Accepted Answer · answered May 03 '21 at 16:53

My approach to your problem is to wrap the filling and replacing in a function and use it as parameter in pandas .apply(). Using you approach will require the usage of .loc[].

updatedDf = pd.DataFrame({
    'MachineType' : np.random.choice([True, False], 10, True),
    'Prod/RT' : np.random.choice([np.nan, np.inf, random.random()], 10, True)
})

# solution 1
prod_RT_dict = {True:0.21660, False:0.050261}
def fillProd_RT(row):
    if row['Prod/RT'] != np.inf and pd.notna(row['Prod/RT']):
        return row['Prod/RT']
    else:
        return prod_RT_dict[row['MachineType']]
updatedDf['Prod/RT_2'] = updatedDf.apply(fillProd_RT, axis=1)

# solution 2
updatedDf['Prod/RT_3']=updatedDf['Prod/RT'].replace(np.inf,np.nan)
updatedDf.loc[updatedDf['MachineType']==True,'Prod/RT_3']=updatedDf\
    .loc[updatedDf['MachineType']==True,'Prod/RT_3'].fillna(0.021660)
updatedDf.loc[updatedDf['MachineType']==False,'Prod/RT_3']=updatedDf\
    .loc[updatedDf['MachineType']==False,'Prod/RT_3'].fillna(0.050261)

updatedDf

updatedDf\ .loc[updatedDf['MachineType']==False,'Prod/RT_3'].fillna(0.050261) what does \ meaning after updatedDf? — johnson, May 03 '21 at 18:07
`\` is a wrapper for long one line code. I added it because I didn't want to write those line in one long line. If you don't understand my explanation, read [Breaking a line of python to multiple lines?](https://stackoverflow.com/questions/14417571/breaking-a-line-of-python-to-multiple-lines) — Gusti Adli, May 03 '21 at 18:13

fillna and replace based on column condition python pandas

1 Answers1