I'm trying to use na.locf
from package zoo
with grouped data using dplyr
. I'm using the first solution on this question: Using dplyr window-functions to make trailing values (fill in NA values)
library(dplyr);library(zoo)
df1 <- data.frame(id=rep(c("A","B"),each=3),problem=c(1,NA,2,NA,NA,NA),ok=c(NA,3,4,5,6,NA))
df1
id problem ok
1 A 1 NA
2 A NA 3
3 A 2 4
4 B NA 5
5 B NA 6
6 B NA NA
The problem happens when, within a group, all the data is NA. As you can see in the problem column, the na.locf
data for id=B comes from another group: the last data of id=A.
df1 %>% group_by(id) %>% na.locf()
Source: local data frame [6 x 3]
Groups: id [2]
id problem ok
<chr> <chr> <chr>
1 A 1 <NA>
2 A 1 3
3 A 2 4
4 B 2 5 #problem col is wrong
5 B 2 6 #problem col is wrong
6 B 2 6 #problem col is wrong
This is my expected result. The data for id=B is independent of what is in id=A
id problem ok
<chr> <chr> <chr>
1 A 1 <NA>
2 A 1 3
3 A 2 4
4 B NA 5
5 B NA 6
6 B NA 6