Yesterday I posted about how to expand/complete values within groups.
The solution worked great when I tested it on a minimal example df, but on my real data it still hasn't finished computing after several hours.
My dplyr pipe flow looks like this:
mydf <- mydf |>
  group_by_at(vars(id:trial_day)) |>
  summarise_at(vars(bla:last_col()), sum) |>
  complete(trial_day = 1:14)
I tried swapping out complete() for expand(), but doing so keeps only the grouping and expanded variables; the other variables are dropped, e.g.:
library(dplyr)
library(tidyr)

df <- data.frame(
  id = rep('a', 5),
  x = 6:10,
  y = 5:1
)

# returns all cols
df |>
  group_by(id) |>
  complete(x = 1:10)

# only returns id and x, no y
df |>
  group_by(id) |>
  expand(x = 1:10)
But I'm not sure if expand would even be any faster.
I also tried a right join onto a ladder df, ladder = data.frame(day = 1:14), but that resulted in the expanded rows having NA for the grouping vars. I wanted those to be filled in whenever an expansion took place.
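To illustrate the join problem, here is a minimal sketch on the toy data rather than my real data. The ladder here uses x = 1:10 instead of my real day column, and the name joined is just for illustration:

```r
library(dplyr)

# Toy data: one group 'a' with x values 6..10 only
df <- data.frame(id = rep("a", 5), x = 6:10, y = 5:1)

# Ladder of every x value I want present (illustrative stand-in for day = 1:14)
ladder <- data.frame(x = 1:10)

# Right join keeps every row of ladder; x values missing from df get NA elsewhere
joined <- df |>
  right_join(ladder, by = "x")

joined
```

The rows that exist only in ladder come back with NA in id (as well as y), whereas complete() on the grouped df fills in id for the new rows.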
Is there a faster way to get the same result as I get using complete()?