I have a time series data frame with missing data, but the data frame was created in such a way that the rows for the missing dates were dropped, so the remaining dates simply follow one another, as shown below.

date,flow
1/1/1984,35.288
1/2/1984,28.858
1/3/1984,35.89
1/5/1984,1.71
1/7/1984,1.15
1/9/1984,8.99
1/16/1984,0.1
1/17/1984,10.64
1/18/1984,7.77
1/19/1984,2.59
1/20/1984,19.04
1/21/1984,32.51
1/22/1984,17.9
1/23/1984,0.74
1/30/1984,0.1
2/5/1984,1.15

What I want is a result like the following:

date,flow
1/1/1984,35.288
1/2/1984,28.858
1/3/1984,35.89
1/5/1984,1.71
1/7/1984,1.15
1/9/1984,8.99
1/10/1984,NA
1/11/1984,NA
1/12/1984,NA
1/13/1984,NA
1/14/1984,NA
1/15/1984,NA
1/16/1984,0.1
1/17/1984,10.64
1/18/1984,7.77
1/19/1984,2.59
1/20/1984,19.04
1/21/1984,32.51
1/22/1984,17.9
1/23/1984,0.74
1/30/1984,0.1
1/31/1984,NA
2/1/1984,NA
2/2/1984,NA
2/3/1984,NA
2/4/1984,NA
2/5/1984,1.15

Thanks.

Ahdy
    You could create a sequence of dates you want in a new data frame using `seq` with the second column set to `NA` then left join your data frame to it by date. – Allan Cameron Nov 28 '20 at 19:01
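
The suggestion in the comment above can be sketched in base R roughly as follows. This is a minimal illustration, not the commenter's own code: the small `df` here is a shortened stand-in for the question's data, and `all_days` and `filled` are names chosen for the example. It also skips the explicit `NA` column, since `merge` with `all.x = TRUE` fills in `NA` for unmatched dates on its own.

```r
# Shortened stand-in for the question's data (m/d/Y strings).
df <- data.frame(
  date = c("1/1/1984", "1/2/1984", "1/5/1984"),
  flow = c(35.288, 28.858, 1.71)
)
df$date <- as.Date(df$date, format = "%m/%d/%Y")

# Every calendar day between the first and last observation.
all_days <- data.frame(date = seq(min(df$date), max(df$date), by = "day"))

# Left join: keep every day; flow is NA where no observation exists.
filled <- merge(all_days, df, by = "date", all.x = TRUE)
filled
```

Here `filled` has one row per day from 1984-01-01 to 1984-01-05, with `NA` in `flow` for the two days that were missing from `df`.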

1 Answer

We can use `complete` from tidyr:

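library(dplyr)
library(tidyr)
library(lubridate)
library(zoo)
df1 %>%
    # parse the m/d/Y strings into Date objects
    mutate(date = mdy(date)) %>%
    # group by calendar month, so gaps are filled only between
    # each month's first and last observed date
    group_by(grp = as.yearmon(date)) %>%
    complete(date = seq(min(floor_date(date, 'day')),
                   max(ceiling_date(date, 'day') - 1), by = 'day')) %>%
    ungroup() %>%
    # drop the helper grouping column
    select(-grp)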

Or, without grouping, to fill every missing day between the first and last date:

df1 %>%
    mutate(date = mdy(date)) %>%
    complete(date = seq(min(date), max(date), by = 'day'))

data

df1 <- structure(list(date = c("1/1/1984", "1/2/1984", "1/3/1984", "1/5/1984", 
"1/7/1984", "1/9/1984", "1/16/1984", "1/17/1984", "1/18/1984", 
"1/19/1984", "1/20/1984", "1/21/1984", "1/22/1984", "1/23/1984", 
"1/30/1984", "2/5/1984"), flow = c(35.288, 28.858, 35.89, 1.71, 
1.15, 8.99, 0.1, 10.64, 7.77, 2.59, 19.04, 32.51, 17.9, 0.74, 
0.1, 1.15)), class = "data.frame", row.names = c(NA, -16L))
akrun