fill in values between the start and end of multiple values

Question

I have a similar question to this post: Fill in values between start and end value in R

The difference is that I need to fill in values between the start and end of multiple values, and the data don't contain an ID column:

My data look like this (the original data have many different values):

My final result should look like this:

Data:

structure(list(elevation = c(150L, 140L, 130L, 120L, 110L, 120L, 130L, 140L, 150L, 90L, 80L, 70L, 66L, 60L, 50L, 66L, 70L, 72L, 68L, 65L, 60L, 68L, 70L), code = c(NA, NA, "W", NA, NA, NA, "W", NA, NA, NA, NA, NA, "X", NA, NA, "X", NA, NA, "Y", NA, NA, "Y", NA)), class = "data.frame", row.names = c(NA, -23L))

Thanks in advance

Can you please update your post by adding your desired output as text not as an image? — Ed_Gravy, Nov 09 '22 at 20:08
I didn't know how to do it so I provided pictures AND the actual data... — AnonX, Nov 09 '22 at 20:32

score 5 · Accepted Answer · answered Nov 09 '22 at 20:20

library(dplyr)

df %>%
  mutate(code = runner::fill_run(code, only_within = TRUE))

   elevation code
1        150 <NA>
2        140 <NA>
3        130    W
4        120    W
5        110    W
6        120    W
7        130    W
8        140 <NA>
9        150 <NA>
10        90 <NA>
11        80 <NA>
12        70 <NA>
13        66    X
14        60    X
15        50    X
16        66    X
17        70 <NA>
18        72 <NA>
19        68    Y
20        65    Y
21        60    Y
22        68    Y
23        70 <NA>
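
For anyone who can't install runner, here is a minimal base R sketch of the rule I understand `only_within = TRUE` to apply: an NA is filled only when the nearest non-NA value before it and the nearest non-NA value after it are equal. The helper name `fill_between_matching` is made up for this illustration and is not part of any package.

```r
# Hypothetical helper (not from runner): fill an NA only when the nearest
# non-NA value before it equals the nearest non-NA value after it.
fill_between_matching <- function(x) {
  vals <- x[!is.na(x)]            # non-NA values, in order of appearance
  k <- cumsum(!is.na(x))          # number of non-NA values seen up to each row
  prev_val <- c(NA, vals)[k + 1]  # nearest non-NA value at or before each row
  next_val <- c(vals, NA)[k + 1]  # nearest non-NA value strictly after each row
  fill <- is.na(x) & !is.na(prev_val) & !is.na(next_val) & prev_val == next_val
  x[fill] <- prev_val[fill]
  x
}

# The `code` column from the question
code <- c(NA, NA, "W", NA, NA, NA, "W", NA, NA, NA, NA, NA, "X",
          NA, NA, "X", NA, NA, "Y", NA, NA, "Y", NA)

filled <- fill_between_matching(code)
```

Under this rule a gap between two separate pairs that happen to share the same code would be filled as well, so the sketch assumes neighbouring pairs carry different codes, as they do in the sample data.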

score 3 · Answer 2 · answered Nov 09 '22 at 20:13

This may not be pretty but it works:

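# dd is the data frame from the question
codepos <- which(!is.na(dd$code))        # row indices of the non-NA codes
stopifnot(length(codepos) %% 2 == 0)     # the codes must come in pairs
for (group in split(codepos, (seq_along(codepos) + 1) %/% 2)) {
  # the start and end of each pair must carry the same code
  stopifnot(dd$code[group[1]] == dd$code[group[2]])
  # fill every row from the start to the end of the pair
  dd$code[group[1]:group[2]] <- dd$code[group[1]]
}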

We start by finding the positions of all the non-NA codes. We assume that they always come in pairs, and then fill in the range covered by each pair.
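
To make the pairing step concrete, here is a small sketch that runs only the index arithmetic on the `code` column from the question. The names `pair_id` and `pairs` are introduced just for this illustration.

```r
# The `code` column from the question
code <- c(NA, NA, "W", NA, NA, NA, "W", NA, NA, NA, NA, NA, "X",
          NA, NA, "X", NA, NA, "Y", NA, NA, "Y", NA)

# Row indices of the non-NA codes
codepos <- which(!is.na(code))
codepos
# 3 7 13 16 19 22

# (1:6 + 1) %/% 2 is 1 1 2 2 3 3, so consecutive positions share a pair id
pair_id <- (seq_along(codepos) + 1) %/% 2

# One list element per pair: c(3, 7), c(13, 16), c(19, 22)
pairs <- split(codepos, pair_id)
```

Each element of `pairs` holds the start and end row of one range, which is what the `for` loop iterates over.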

Answer 3 · answered Nov 09 '22 at 23:26 · Andre Wildberg

Here's a tidyverse approach. It generates a temporary grouping by assigning values to the pattern given through the alternating NAs and characters.

library(dplyr)
library(tidyr)

df %>% 
  mutate(n = n(), l_c = lag(code)) %>% 
  group_by(grp = cumsum(lag(!is.na(code), default = FALSE) == is.na(code)),
           grp_in = grp %in% seq(2, unique(n), 4)) %>% 
  fill(l_c) %>%
  ungroup() %>% 
  mutate(code = ifelse(grp_in, l_c, code)) %>% 
  select(elevation, code) %>%
  print(n = Inf)
# A tibble: 23 × 2
   elevation code
       <int> <chr>
 1       150 NA
 2       140 NA
 3       130 W
 4       120 W
 5       110 W
 6       120 W
 7       130 W
 8       140 NA
 9       150 NA
10        90 NA
11        80 NA
12        70 NA
13        66 X
14        60 X
15        50 X
16        66 X
17        70 NA
18        72 NA
19        68 Y
20        65 Y
21        60 Y
22        68 Y
23        70 NA
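
The temporary grouping can be inspected on its own. This is a base R sketch of the same arithmetic, with `c(FALSE, head(x, -1))` standing in for `dplyr::lag(x, default = FALSE)`. The intermediate names `is_marker`, `prev_is_marker` and `switched` are introduced only for this illustration.

```r
# The `code` column from the question
code <- c(NA, NA, "W", NA, NA, NA, "W", NA, NA, NA, NA, NA, "X",
          NA, NA, "X", NA, NA, "Y", NA, NA, "Y", NA)

is_marker <- !is.na(code)
# Base R stand-in for dplyr::lag(is_marker, default = FALSE)
prev_is_marker <- c(FALSE, head(is_marker, -1))

# TRUE wherever a row changes between NA and non-NA relative to the row before
switched <- prev_is_marker == is.na(code)

# Running count of those changes: every run of NAs or codes gets its own id
grp <- cumsum(switched)
grp
# 0 0 1 2 2 2 3 4 4 4 4 4 5 6 6 7 8 8 9 10 10 11 12

# Runs 2, 6, 10, ... are the NA runs enclosed by a start and an end code
grp_in <- grp %in% seq(2, length(code), 4)
which(grp_in)
# 4 5 6 14 15 20 21
```

As far as I can tell, the every-fourth-run pattern relies on each start and end code occupying a single row and on at least one NA row separating one pair from the next, which holds for the sample data.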
