My data essentially looks like this, in a much shortened form:
df <- data.frame(id = c(1,1,1,2,2,2,2,2,3,3,3,3),
height = c(150, NA, 151, NA, NA, 176, 175, 174, 198, NA, 197, 198))
What I would like to do is compute the mean height for each of these IDs and then plug that height in for every NA for that given ID. So ID 1 should have a mean height of 150.5, thus the first NA should be replaced by 150.5. Then ID 2 has a mean height of 175, so I'd like to plug in 175 for the two NAs associated with ID 2. And so on.
I know I could manually enter these with things like df[2,2] <- 150.5, but in reality I have thousands of IDs and this wouldn't be feasible.
I'm pretty comfortable with the dplyr package and I figure I should utilize group_by(id) somehow, but I can't figure out the rest.
Any suggestions?
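For reference, here is a minimal sketch of the kind of thing I have in mind, assuming a grouped mutate() is the right tool. The name df_filled is just a placeholder I made up, and I'm not sure this is the most idiomatic way to do it:

```r
library(dplyr)

# Same toy data as above
df <- data.frame(id = c(1,1,1,2,2,2,2,2,3,3,3,3),
                 height = c(150, NA, 151, NA, NA, 176, 175, 174, 198, NA, 197, 198))

df_filled <- df %>%
  group_by(id) %>%
  # Within each id, replace NA with the mean of that id's non-missing heights
  mutate(height = ifelse(is.na(height), mean(height, na.rm = TRUE), height)) %>%
  ungroup()
```

With this, row 2 should come out as 150.5 and rows 4 and 5 as 175. One caveat: an ID whose heights are all NA would get NaN, because there is nothing to average.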