I'm trying to fill in a grouped column in a tibble by accessing the prior value in the column, but when I run the dplyr script I'm not getting the values I expect. It appears that the script is accessing the original tibble values and not the recently updated values in the column.

If I am unable to access the most recent value, how would I go about filling in the `yr` column?

I have attempted to provide a reproducible example via dput and reprex.

dput: structure(list(name = c("Jim", "Jim", "Jim", "Jim", "Joe", "Jack", "Jane", "Jane", "Jane", "Jane", "Jane", "Jane", "Jane"), yr = c(2019, 0, 0, 0, 2018, 2019, 2019, 0, 0, 2018, 0, 0, 0)), row.names = c(NA, -13L), class = c("tbl_df", "tbl", "data.frame"))

reprex:

library(tidyverse)

library(reprex)

name <- c('Jim', 'Jim', 'Jim', 'Jim', 'Joe', 'Jack','Jane','Jane','Jane','Jane','Jane','Jane','Jane')

yr <- c(2019,0,0,0,2018,2019,2019,0,0,2018,0,0,0)
t <- tibble(name, yr)
print("Tibble t - Before mutate with lag")
#> [1] "Tibble t - Before mutate with lag"
print(t)
#> # A tibble: 13 x 2
#>    name     yr
#>    <chr> <dbl>
#>  1 Jim    2019
#>  2 Jim       0
#>  3 Jim       0
#>  4 Jim       0
#>  5 Joe    2018
#>  6 Jack   2019
#>  7 Jane   2019
#>  8 Jane      0
#>  9 Jane      0
#> 10 Jane   2018
#> 11 Jane      0
#> 12 Jane      0
#> 13 Jane      0
t <- t %>%
  group_by(name) %>% 
  mutate(yr = ifelse(yr==0, lag(yr), yr))
print("Tibble t - After mutate with lag")
#> [1] "Tibble t - After mutate with lag"
print(t)
#> # A tibble: 13 x 2
#> # Groups:   name [4]
#>    name     yr
#>    <chr> <dbl>
#>  1 Jim    2019
#>  2 Jim    2019
#>  3 Jim       0
#>  4 Jim       0
#>  5 Joe    2018
#>  6 Jack   2019
#>  7 Jane   2019
#>  8 Jane   2019
#>  9 Jane      0
#> 10 Jane   2018
#> 11 Jane   2018
#> 12 Jane      0
#> 13 Jane      0

Created on 2019-09-04 by the reprex package (v0.3.0)

Krantz
Mutuelinvestor
  • Are you just looking to group by `name` and fill the `0`s with the last non-zero `yr`? If yes, go for `t %>% group_by(name) %>% tidyr::fill(yr)`. Convert the `0`s to `NA` first, although I have a feeling you changed `NA`s to `0` to begin with. – Shree Sep 04 '19 at 20:27
  • @shree thanks for the suggestion, but that did not seem to do the trick. It seems to have sorted by name and then performed a fill. Not sure why it sorted by name. – Mutuelinvestor Sep 04 '19 at 20:37
  • You need to convert the `0` to `NA` first. – Shree Sep 04 '19 at 20:41
  • @Shree - Yes, that did the trick. Many thanks. – Mutuelinvestor Sep 04 '19 at 20:50
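
For later readers, the approach worked out in the comments can be sketched as below. This is a minimal sketch, assuming every `0` in `yr` stands for a missing year; the result name `filled` is only illustrative. `lag()` is vectorised, so within a single `mutate()` it shifts the column as it existed before the call, which is why only the first `0` in each run was replaced in the question. `tidyr::fill()` instead carries the last non-missing value forward within each group.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
t <- tibble(
  name = c("Jim", "Jim", "Jim", "Jim", "Joe", "Jack", "Jane",
           "Jane", "Jane", "Jane", "Jane", "Jane", "Jane"),
  yr   = c(2019, 0, 0, 0, 2018, 2019, 2019, 0, 0, 2018, 0, 0, 0)
)

filled <- t %>%
  mutate(yr = na_if(yr, 0)) %>%      # assumption: 0 means "missing"
  group_by(name) %>%
  fill(yr, .direction = "down") %>%  # carry last known year forward per name
  ungroup()

print(filled)
```

With this data, Jim's rows should all become 2019, and Jane's should become 2019 for rows 7 to 9 and 2018 for rows 10 to 13.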

0 Answers