Here is a data frame:
library(tidyverse)
set.seed(1)  # fixed seed so the sampled values are reproducible
# 5 companies observed each day for 10 days
df <- tibble(
  company = rep(LETTERS[1:5], 10),
  value = rep(sample(100, 5), 10),
  date = rep(seq(as.Date("2020-01-01"), as.Date("2020-01-10"), 1), each = 5)
)
df
Now suppose something happens to the data and some of the company E rows are removed:
df_error <- df[-c(5, 10, 15, 20), ]
df_error
What is the simplest Tidyverse way to add back the missing E rows? The value in the restored rows doesn't matter. The date of each restored E row should be the same as that of the D row above it.
I started with the following and wasn't sure how to proceed:
# Find the row indices of all D occurrences
d_idx <- which(df_error$company == "D")
d_idx
# If the row after a D is not an E, keep that D's index. These rows need an E row inserted below them.
rows_need_e_below <- ifelse(df_error$company[d_idx + 1] != "E", d_idx, NA)
rows_need_e_below
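One direction that might avoid index bookkeeping altogether, offered as a sketch rather than a definitive answer: `tidyr::complete()` adds a row for every combination of the listed columns that is missing from the data. This assumes each company should appear exactly once per date, which holds for the example above; under that assumption, the restored E row automatically gets the same date as the D row before it. The name `df_fixed` and the fixed seed are illustrative choices, not part of the original data.

```r
library(dplyr)
library(tidyr)
library(tibble)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible

# Rebuild the example data: 5 companies observed each day for 10 days
df <- tibble(
  company = rep(LETTERS[1:5], 10),
  value = rep(sample(100, 5), 10),
  date = rep(seq(as.Date("2020-01-01"), as.Date("2020-01-10"), 1), each = 5)
)

# Drop four of the company E rows
df_error <- df[-c(5, 10, 15, 20), ]

# complete() adds the missing date/company combinations;
# value is NA in the rows it adds.
df_fixed <- df_error %>%
  complete(date, company) %>%
  select(company, value, date) %>%  # restore the original column order
  arrange(date, company)            # E ends up directly below D for each date

df_fixed
```

The restored rows have `NA` in `value`, which is acceptable here because the value doesn't matter. If a company could legitimately be absent on some dates, this approach would add rows for those gaps too, so the index-based route may be the better fit in that case.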