Impute Column Values Based on Other Column's Conditions - Python Pandastic Way

Question

I have a data frame like this,

    cat1     cat2   cat3 day
0   10       25     12   Monday 
1   11       22     14   Tuesday
2   12       30     15   Wednesday
3   10       36     12   Thursday
4   15       38     14   Friday
5   NaN      42     13   Monday
6   11       32     10   Tuesday
7   15       30     12   Wednesday
8   13       28     09   Thursday
9   10       29     10   Friday
10  12       30     16   Monday

Now I want to replace the NaN value in cat1 with the mean of the cat1 values from the rows where day is Monday. My expected output would be,

        cat1     cat2   cat3 day
    0   10       25     12   Monday 
    1   11       22     14   Tuesday
    2   12       30     15   Wednesday
    3   10       36     12   Thursday
    4   15       38     14   Friday
    5   11       42     13   Monday
    6   11       32     10   Tuesday
    7   15       30     12   Wednesday
    8   13       28     09   Thursday
    9   10       29     10   Friday
    10  12       30     16   Monday

I was thinking of doing this by filtering my dataframe on the day column's Monday value, replacing NaN with the result of .mean(), and then appending it back to the dataframe. For example,

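monday_df = df[df['day'] == 'Monday'].copy()  # copy, so the assignment below does not act on a slice of df
mean_val = monday_df['cat1'].mean()           # mean() skips NaN by default
monday_df['cat1'] = monday_df['cat1'].fillna(mean_val)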

After this, I would append it back to the original df. But this seems like a lot of code. Is there a pandastic way (possibly a one-liner) to do this so that I get my desired output?
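For illustration, here is a minimal sketch of one way to do this without splitting and re-appending the frame. It assumes the dataframe is named `df` and has the columns shown in the question. The `is_monday` and `monday_mean` names are made up for this example. It fills only the Monday rows, and it does so in place on `df`, so the whole dataframe is kept.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (assumed to be held in `df`).
df = pd.DataFrame({
    "cat1": [10, 11, 12, 10, 15, np.nan, 11, 15, 13, 10, 12],
    "cat2": [25, 22, 30, 36, 38, 42, 32, 30, 28, 29, 30],
    "cat3": [12, 14, 15, 12, 14, 13, 10, 12, 9, 10, 16],
    "day": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
            "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Monday"],
})

# Boolean mask selecting the Monday rows (hypothetical helper name).
is_monday = df["day"] == "Monday"

# Mean of cat1 over Monday rows only; mean() ignores NaN by default.
monday_mean = df.loc[is_monday, "cat1"].mean()

# Fill NaN in cat1 for Monday rows only, writing back into df itself.
df.loc[is_monday, "cat1"] = df.loc[is_monday, "cat1"].fillna(monday_mean)

print(df)
```

Because the column contained a NaN, pandas stores `cat1` as floats, so the filled value appears as `11.0` rather than `11`.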

`df['cat1'].fillna(df.loc[df['day']=='Monday']['cat1'].mean())` ? — harvpan, Jun 26 '19 at 17:35
Wrong dup, this is `mean`: https://stackoverflow.com/questions/19966018/pandas-filling-missing-values-by-mean-in-each-group — Erfan, Jun 26 '19 at 17:36
@harvpan this is giving me a column of values :( I need a whole `df` as output. — user9431057, Jun 26 '19 at 17:45
@Erfan your code is missing a `]` and also just returning a column of values. But I need a whole dataframe as output — user9431057, Jun 26 '19 at 17:46
@user9431057 that is your imput-ed `cat1`. :( `df['cat1']=df['cat1'].fillna(df.loc[df['day']=='Monday']['cat1'].mean())` updates your `df`. — harvpan, Jun 26 '19 at 17:51
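
The linked question about filling missing values with the mean of each group suggests a more general variant. The following is a hedged sketch rather than a confirmed answer. It assumes the same `df` as in the question and fills every NaN in `cat1` with the mean of its own `day` group, assigning the result back to the column so that `df` stays a whole dataframe.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question (assumed to be held in `df`).
df = pd.DataFrame({
    "cat1": [10, 11, 12, 10, 15, np.nan, 11, 15, 13, 10, 12],
    "cat2": [25, 22, 30, 36, 38, 42, 32, 30, 28, 29, 30],
    "cat3": [12, 14, 15, 12, 14, 13, 10, 12, 9, 10, 16],
    "day": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
            "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Monday"],
})

# transform("mean") returns a Series aligned with df's index, holding
# each row's per-day mean of cat1 (NaN values are skipped in the mean).
per_day_mean = df.groupby("day")["cat1"].transform("mean")

# fillna with an aligned Series replaces each NaN with the value at the
# same index; assigning back updates the column inside df.
df["cat1"] = df["cat1"].fillna(per_day_mean)

print(df)
```

With the sample data, only row 5 is missing, so this gives the same result as filling with the Monday mean. It differs from the Monday-only approach when NaN values appear on other days as well.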
