1
library(tidyverse)
testdata <- tibble(ID=c(1,NA,NA,2,NA,3),
                   Observation = LETTERS[1:6])

testdata1 <- testdata %>% 
  mutate(
    ID1 = case_when(
      is.na(ID) ~ lag(ID, 1),
      TRUE ~ ID
    )
  )

testdata1

I have a dataset like testdata, with a valid ID only when ID changes. There can be an arbitrary number of records in a set, but the above case_when and lag() structure does not fill in ID for all records, just for record 2 in each group. Is there a way to get the 3rd (or deeper) IDs filled with the appropriate value?

evidently
  • 43
  • 4

1 Answers1

1

We can use fill from the tidyr package. Since you are using tidyverse, tidyr is already inlcuded.

testdata1 <- testdata %>% 
  fill(ID)

testdata1
# # A tibble: 6 x 2
#      ID Observation
#   <dbl> <chr>      
# 1     1 A          
# 2     1 B          
# 3     1 C          
# 4     2 D          
# 5     2 E          
# 6     3 F

Or we can use na.locf from the zoo package.

library(zoo)

testdata1 <- testdata %>% 
  mutate(ID = na.locf(ID))

testdata1
# # A tibble: 6 x 2
#      ID Observation
#   <dbl> <chr>      
# 1     1 A          
# 2     1 B          
# 3     1 C          
# 4     2 D          
# 5     2 E          
# 6     3 F 
www
  • 38,575
  • 12
  • 48
  • 84