
I have a data frame in which I need to replace each NA with the average of its column, where the average is computed only over rows that share the same value in another column.

The data frame looks like this:

ROW  location  col1  col2  col3
1    A         1     2     3
2    A         NA    3     5
3    A         3     NA    2
4    B         3     NA    3
5    B         NA    5     3
6    B         3     3     5

For example, I am trying to replace the col1 value in row 2 with the average of col1 over only the rows with location "A", excluding the rows with location "B".

In the past I have used the code below to replace all NAs in each column with the overall column mean, but I now need to restrict that average to the matching location:

# Replace every NA in each column with that column's overall mean
for (i in seq_len(ncol(data))) {
  data[is.na(data[, i]), i] <- mean(data[, i], na.rm = TRUE)
}

The result should look like this:

ROW  location  col1  col2  col3
1    A         1     2     3
2    A         2     3     5
3    A         3     2.5   2
4    B         3     4     3
5    B         3     5     3
6    B         3     3     5
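
One possible base R way to adapt the loop above is sketched below. It is only an illustration, not a confirmed answer: the data frame is rebuilt by hand from the table in the question, and the `value_cols` vector is a name introduced here for the columns to fill. It relies on `ave()`, which returns a per-row group statistic (here the mean within each location).

```r
# Example data reconstructed from the table in the question
data <- data.frame(
  ROW      = 1:6,
  location = c("A", "A", "A", "B", "B", "B"),
  col1     = c(1, NA, 3, 3, NA, 3),
  col2     = c(2, 3, NA, NA, 5, 3),
  col3     = c(3, 5, 2, 3, 3, 5)
)

# Columns to fill (an assumption: only the numeric "col" columns)
value_cols <- c("col1", "col2", "col3")

for (col in value_cols) {
  # For every row, the mean of this column within that row's location,
  # ignoring NAs
  group_mean <- ave(data[[col]], data$location,
                    FUN = function(x) mean(x, na.rm = TRUE))
  # Overwrite only the missing entries with their group's mean
  missing <- is.na(data[[col]])
  data[[col]][missing] <- group_mean[missing]
}

data
```

Note that if every value of a column is NA within one location, `mean(x, na.rm = TRUE)` gives `NaN` for that group, so those cells would need separate handling.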
Ronak Shah
Jordan
Using `dplyr`: `df %>% group_by(location) %>% mutate_at(vars(starts_with("col")), ~replace(., is.na(.), mean(., na.rm = TRUE)))` – Ronak Shah Aug 13 '19 at 00:37
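
For anyone wanting to try the comment's suggestion end to end, here is a runnable sketch of the same idea. It assumes dplyr 1.0.0 or later and swaps `mutate_at()` for `across()`, which is the currently recommended spelling. The `df` and `result` names and the hand-built data are illustrative, taken from the table in the question.

```r
library(dplyr)

# Example data reconstructed from the table in the question
df <- data.frame(
  ROW      = 1:6,
  location = c("A", "A", "A", "B", "B", "B"),
  col1     = c(1, NA, 3, 3, NA, 3),
  col2     = c(2, 3, NA, NA, 5, 3),
  col3     = c(3, 5, 2, 3, 3, 5)
)

result <- df %>%
  group_by(location) %>%
  # Within each location, replace NAs in every column whose name starts
  # with "col" by that group's mean of the column
  mutate(across(starts_with("col"),
                ~ replace(.x, is.na(.x), mean(.x, na.rm = TRUE)))) %>%
  ungroup()

result
```

The `ungroup()` at the end is optional; it just keeps later operations on `result` from being silently grouped by location.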
