I want to fill the NAs in my dataset using an identification variable, `id`. My dataset looks like this:

id      gender   age    other variables.... 
A       NA       NA 
A       NA       NA
A       f        23
B       NA       NA
B       NA       45
C       NA       NA

The number of rows per id is not equal, and some ids have no gender information, no age information, or neither.

I want to fill the NAs within each id whenever that id has gender/age information in some row, so that the result looks like this:

id      gender   age    other variables...
A       f        23 
A       f        23
A       f        23
B       NA       45
B       NA       45
C       NA       NA

I've been searching for a few hours but couldn't find how to do this. I'd really appreciate it if anyone could tell me how.

Jaap
김남희
  • There are lots of dupes for this. Check `?na.locf` from `zoo` or `fill` from `tidyr` – akrun Jul 18 '17 at 07:28
  • Try `library(data.table);library(zoo);setDT(df1)[, lapply(.SD, na.locf, na.rm = FALSE, fromLast = TRUE), id]` or, with `dplyr`/`tidyr`, `df1 %>% group_by(id) %>% fill(gender, age, .direction = "up")` – akrun Jul 18 '17 at 07:32
  • Small addition to akrun's comment: if the information for a group may not be on its last row, you might want to run the code twice, once with the argument `fromLast = TRUE` and once with `fromLast = FALSE`. – Florian Jul 18 '17 at 07:35
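
The suggestions in the comments can be combined into one pass. The following is a minimal sketch, not a definitive solution: it assumes the missing values are real `NA`s (not the string "NA"), that `dplyr` and `tidyr` (version 1.0.0 or later) are installed, and it rebuilds a toy `df1` from the example rows in the question. The result name `filled` is made up for illustration. Instead of running the fill twice, it uses `.direction = "downup"`, which fills downwards and then upwards within each `id`, so the position of the known value inside the group does not matter.

```r
library(dplyr)
library(tidyr)

# Toy data mirroring the rows shown in the question (assumed structure)
df1 <- data.frame(
  id     = c("A", "A", "A", "B", "B", "C"),
  gender = c(NA, NA, "f", NA, NA, NA),
  age    = c(NA, NA, 23, NA, 45, NA),
  stringsAsFactors = FALSE
)

# Fill NAs within each id, first downwards and then upwards
filled <- df1 %>%
  group_by(id) %>%
  fill(gender, age, .direction = "downup") %>%
  ungroup()

print(filled)
```

Groups with no known value at all (gender for B, both columns for C) are left as `NA`, which matches the desired output in the question. If an `id` could carry two different non-missing values, this would spread both of them to neighbouring rows, so it may be worth checking for such conflicts first.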

0 Answers