Fill NAs in a data.table using a reference variable in R

Asked Jul 18 '17 at 07:27

Active Jul 18 '17 at 07:33

Viewed 55 times

I want to fill the NAs in my dataset using an identification variable, id. My dataset looks like this:

id      gender   age    other variables.... 
A       NA       NA 
A       NA       NA
A       f        23
B       NA       NA
B       NA       45
C       NA       NA

The number of rows per id is not equal, and some ids have no gender information, no age information, or neither.

I want to fill the NAs whenever an id has gender or age information on any of its rows, so that the result is:

id      gender   age    other variables...
A       f        23 
A       f        23
A       f        23
B       NA       45
B       NA       45
C       NA       NA

I've been searching for a few hours but couldn't find out how. I would really appreciate it if anyone could tell me how to do this.

edited Jul 18 '17 at 07:33

Jaap


asked Jul 18 '17 at 07:27

김남희


There are lots of dupes for this. Check `?na.locf` from `zoo` or `fill` from `tidyr` – akrun Jul 18 '17 at 07:28

Try `library(data.table);library(zoo);setDT(df1)[, lapply(.SD, na.locf, na.rm = FALSE, fromLast = TRUE) , id]` or `dplyr/tidyr` `df1 %>% group_by(id) %>% fill(gender, age, .direction = "up")` – akrun Jul 18 '17 at 07:32
Small addition to akrun's comment: if the information for a group is not always on the last row, you might want to run the code twice, once with the argument `fromLast = TRUE` and once with `fromLast = FALSE`. – Florian Jul 18 '17 at 07:35
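
Putting the two comments above together, a minimal sketch might look like the following. It assumes the data frame is called `df1` (as in akrun's comment), that `gender` is a character column and `age` is numeric, and that the `data.table` and `zoo` packages are installed. The helper `fill_both` is a hypothetical name introduced here for illustration; it just applies `na.locf` once in each direction, as Florian suggests.

```r
library(data.table)
library(zoo)

# Toy data mirroring the question (column types are an assumption)
df1 <- data.frame(
  id     = c("A", "A", "A", "B", "B", "C"),
  gender = c(NA, NA, "f", NA, NA, NA),
  age    = c(NA, NA, 23, NA, 45, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: carry values backward first, then forward,
# so a value is picked up wherever it sits within the group
fill_both <- function(x) {
  x <- na.locf(x, na.rm = FALSE, fromLast = TRUE)
  na.locf(x, na.rm = FALSE)
}

cols <- c("gender", "age")

# Fill only the listed columns, separately within each id
setDT(df1)[, (cols) := lapply(.SD, fill_both), by = id, .SDcols = cols]

print(df1)
```

Groups with no information at all (gender for B, both columns for C) stay NA, which matches the desired output in the question. Restricting the fill to `cols` via `.SDcols` leaves the other variables untouched; drop that argument if every column should be filled.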
