I am trying to fill missing values in a slice of a single column of a dataframe. The reason is that three values in the column are NaN because the data is genuinely missing, while the other 1400 or so are NaN because the homes didn't have pools. For the first case I want to fill the data with the most common value (the mode). For the latter case, I want to encode the missing data with 'NA', which is the appropriate value for a home with no pool.
My code looks like this. It raises no errors or warnings, but the dataframe is unchanged afterwards:
test_df.loc[test_df.PoolQC.isna() & (test_df.PoolArea == 0), ['PoolQC']].fillna('NA', inplace=True)
test_df.loc[test_df.PoolQC.isna() & (test_df.PoolArea > 0), ['PoolQC']].fillna(mode, inplace=True)
However, the following code works:
test_df.loc[test_df.PoolQC.isna() & (test_df.PoolArea == 0), ['PoolQC']] = 'NA'
test_df.loc[test_df.PoolQC.isna() & (test_df.PoolArea > 0), ['PoolQC']] = mode
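A minimal sketch that reproduces the behaviour described above, assuming pandas and NumPy are installed. The data is invented for illustration; only the names test_df, PoolQC, PoolArea and mode come from the code above, and the masks no_pool and has_pool are hypothetical helpers added for readability.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for test_df; the real frame has far more rows.
test_df = pd.DataFrame({
    "PoolQC": [np.nan, "Gd", np.nan, np.nan],
    "PoolArea": [0, 500, 300, 0],
})
mode = test_df.PoolQC.mode()[0]  # most common non-missing value in this toy data

no_pool = test_df.PoolQC.isna() & (test_df.PoolArea == 0)

# First form: fillna(inplace=True) called on the result of .loc.
# It runs without an exception, but test_df keeps all of its NaN values.
test_df.loc[no_pool, ["PoolQC"]].fillna("NA", inplace=True)
missing_after_fillna = int(test_df.PoolQC.isna().sum())

# Second form: direct assignment through .loc, which does update test_df.
test_df.loc[no_pool, ["PoolQC"]] = "NA"
has_pool = test_df.PoolQC.isna() & (test_df.PoolArea > 0)
test_df.loc[has_pool, ["PoolQC"]] = mode
missing_after_assign = int(test_df.PoolQC.isna().sum())

print(missing_after_fillna, missing_after_assign)
```

Comparing the two counts shows the difference: the first form leaves every NaN in place, while the second fills all of them.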
I can't find anything in the documentation that explains this. I don't particularly mind using the workaround, as it's actually shorter, but I'm curious why the first version has no effect and what the best practice is in cases like this.