I am trying to replace the missing values in a dataframe based on filtering of another column, "Country"
>>> data.head()
Country Advanced skiers, freeriders Snow parks
0 Greece NaN NaN
1 Switzerland 5.0 5.0
2 USA NaN NaN
3 Norway NaN NaN
4 Norway 3.0 4.0
Obviously this is just a small snippet of the data, but I am looking to replace all the NaN
values with the average value for each feature.
I have tried grouping the data by the country and then calculating the mean of each column. When I print out the resulting array, it comes up with the expected values. However, when I put it into the .fillna()
method, the data appears unchanged
I've tried @DSM's solution from this similar post, but I am not sure how to apply it to multiple columns.
listOfRatings = ['Advanced skiers, freeriders', 'Snow parks']
print (data.groupby('Country')[listOfRatings].mean().fillna(0))
-> displays the expected results
data[listOfRatings] = data[listOfRatings].fillna(data.groupby('Country')[listOfRatings].mean().fillna(0))
-> appears to do nothing to the dataframe
Assuming this is the complete dataset, this is what I would expect the results to be.
Country Advanced skiers, freeriders Snow parks
0 Greece 0.0 0.0
1 Switzerland 5.0 5.0
2 USA 0.0 0.0
3 Norway 3.0 4.0
4 Norway 3.0 4.0
Can anyone explain what I am doing wrong, and how to fix the code?