0

I have a dataframe (df) of time series data in long format that has a variable ID and Day and Amount.

id  day amount
1   41  15.29
1   42  15.29
1   43  15.29
1   44  15.29
1   45  15.3
1   46  15.3
1   47  15.3
1   48  15.31
1   49  15.31
1   50  15.31
1   52  15.32
1   53  15.32
1   54  15.32
1   55  15.32
1   58  15.33
1   59  15.34
1   60  15.34
1   61  15.34

Notice that day 51 is missing. What I want to do is locate the values that are out of missing and then add the row, and renumber then repeat for the entire df.

So it would end up like this:

1   41  15.29
1   42  15.29
1   43  15.29
1   44  15.29
1   45  15.3
1   46  15.3
1   47  15.3
1   48  15.31
1   49  15.31
1   50  15.31
-99 51  -99
1   52  15.32
1   53  15.32
1   54  15.32
1   55  15.32
1   58  15.33
1   59  15.34
1   60  15.34
1   61  15.34

1 Answers1

0

You could use complete to fill the missing days.

library(dplyr)
library(tidyr)

df1 <- df %>%
        mutate(DAY1 = DAY) %>%
        complete(ID, DAY, fill = list(DAY1 = -99)) %>%
        select(ID, DAY = DAY1)
df1
#      ID   DAY
#   <int> <dbl>
# 1     1     1
# 2     1     2
# 3     1     3
# 4     1     4
# 5     2     1
# 6     2   -99
# 7     2     3
# 8     2     4
# 9     3     1
#10     3   -99
#11     3     3
#12     3   -99

For the updated data we can try to add day by group :

df1 <- df %>%
        group_by(id) %>%
        complete(day = min(day):max(day), fill = list(amount = -99)) 
df1

data

df <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L), DAY = c(1L, 
2L, 3L, 4L, 1L, 3L, 4L, 1L, 3L)), class = "data.frame", row.names = c(NA, -9L))
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213