I would like to impute NA's in a dataframe with means of observed data in each country. In other words, while dealing with NAs, the values in the specific country should be taken into consideration. For instance;
Date Country Battles Riots
March 2018 Afghanistan 380 NA
March 2018 Yemen 88 5
March 2018 Mali 45 NA
April 2018 Afghanistan 350 NA
April 2018 Yemen NA 66
April 2018 Mali 67 NA
May 2018 Afghanistan NA 7
May 2018 Yemen NA NA
May 2018 Mali NA 6
I have used the following code, but obviously it calculates the means without taking the country specific information.
for(i in 6:ncol(my_data)) {
my_data[ , i][is.na(my_data[ , i])] <- mean(my_data[ , i], na.rm = TRUE)
}
Many thanks in advance.