How to increase time series granularity in R Dataframe?

Question

I have a dataframe that contains hourly weather information. I would like to increase the granularity of the time measurements (5-minute intervals instead of 60-minute intervals), copying the other columns' values into the newly created rows:

Current Dataframe Structure:

Date                Temperature Humidity
2015-01-01 00:00:00 25          0.67
2015-01-01 01:00:00 26          0.69

Target Dataframe Structure:

Date                Temperature Humidity 
2015-01-01 00:00:00 25          0.67
2015-01-01 00:05:00 25          0.67
2015-01-01 00:10:00 25          0.67
.
.
.
2015-01-01 00:55:00 25          0.67
2015-01-01 01:00:00 26          0.69
2015-01-01 01:05:00 26          0.69
2015-01-01 01:10:00 26          0.69
.
.
.

What I've Tried:

for(i in 1:nrow(df)) {


  five.minutes <- seq(df$date[i], length = 12, by = "5 mins")

  for(j in 1:length(five.minutes)) {

    df$date[i]<-rbind(five.minutes[j])

  }
}

Error I'm getting:

Error in as.POSIXct.numeric(value) : 'origin' must be supplied

I apologize, I made an error when initially typing it in, so I was just fixing the typo. No, the target dataframe should not change. For example, 2015-01-01 01:00:00-01:55:00 should have temperature = 26 and humidity = 0.69, while 2015-01-01 00:00:00-00:55:00 should have temperature = 25 and humidity = 0.67 — lsch91, Feb 17 '18 at 23:07
They aren't. Let me clarify the target dataframe above in the question — lsch91, Feb 17 '18 at 23:09

MKR · Accepted Answer · 2018-02-18T00:37:43.897

One possible solution is to use fill from tidyr and right_join from dplyr.

The approach is to build a date/time series at 5-minute intervals running from the earliest timestamp in the dataframe to 55 minutes past the latest one. Joining the dataframe onto this series yields all the desired rows, with NA for Temperature and Humidity in the newly created ones. fill then replaces each NA with the previous valid value.

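library(dplyr)
library(tidyr)

# Data
df <- read.table(text = "Date                Temperature Humidity 
'2015-01-01 00:00:00' 25          0.67
'2015-01-01 01:00:00' 26          0.69
'2015-01-01 02:00:00' 28          0.69
'2015-01-01 03:00:00' 25          0.69", header = TRUE, stringsAsFactors = FALSE)

df$Date <- as.POSIXct(df$Date, format = "%Y-%m-%d %H:%M:%S")

# Create a dataframe with every date/time at 5-minute intervals,
# ending 55 minutes after the last hourly reading
Dates <- data.frame(Date = seq(min(df$Date), max(df$Date) + 55 * 60, by = 5 * 60))

# Join onto the full series, sort by Date so that fill() carries values
# forward in time order regardless of the join's row order, then fill the NAs.
# The result is assigned to a new dataframe (it could also be assigned back to df).
result <- df %>%
  right_join(Dates, by = "Date") %>%
  arrange(Date) %>%
  fill(Temperature, Humidity)

result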
#                  Date Temperature Humidity
#1  2015-01-01 00:00:00          25     0.67
#2  2015-01-01 00:05:00          25     0.67
#3  2015-01-01 00:10:00          25     0.67
#4  2015-01-01 00:15:00          25     0.67
#5  2015-01-01 00:20:00          25     0.67
#6  2015-01-01 00:25:00          25     0.67
#7  2015-01-01 00:30:00          25     0.67
#8  2015-01-01 00:35:00          25     0.67
#9  2015-01-01 00:40:00          25     0.67
#10 2015-01-01 00:45:00          25     0.67
#11 2015-01-01 00:50:00          25     0.67
#12 2015-01-01 00:55:00          25     0.67
#13 2015-01-01 01:00:00          26     0.69
#14 2015-01-01 01:05:00          26     0.69
#.....
#.....
#44 2015-01-01 03:35:00          25     0.69
#45 2015-01-01 03:40:00          25     0.69
#46 2015-01-01 03:45:00          25     0.69
#47 2015-01-01 03:50:00          25     0.69
#48 2015-01-01 03:55:00          25     0.69
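
For comparison, the same result can be sketched in base R without a join, by repeating each hourly row 12 times and adding 5-minute offsets to its timestamp. This is only a minimal sketch, not part of the answer above: the names steps, idx and expanded are made up for illustration, the sample data is retyped from the question, and it assumes the input rows are exactly one hour apart.

```r
# Sample data mirroring the question (values retyped here for illustration)
df <- data.frame(
  Date = as.POSIXct(c("2015-01-01 00:00:00", "2015-01-01 01:00:00"), tz = "UTC"),
  Temperature = c(25, 26),
  Humidity = c(0.67, 0.69)
)

# Offsets within one hour, in seconds: 0, 5, ..., 55 minutes
steps <- seq(0, 55, by = 5) * 60

# Repeat every row index once per offset (12 copies of each hourly row)
idx <- rep(seq_len(nrow(df)), each = length(steps))

# Copy the rows, then shift each copy's timestamp by its offset
expanded <- df[idx, ]
expanded$Date <- expanded$Date + rep(steps, times = nrow(df))
rownames(expanded) <- NULL

head(expanded, 13)
```

Because no NA values are ever created, no fill step is needed here. The join-based approach above is the safer choice when the hourly series may have gaps or irregular spacing.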

Thanks for the help. When I try the code above I now get this error message: Error in UseMethod("right_join") : no applicable method for 'right_join' applied to an object of class "function" — lsch91, Feb 18 '18 at 00:10
Yeah, I have that one loaded and I have the %>% in both locations indicated above — lsch91, Feb 18 '18 at 00:17
It prints the first 58 rows (which all look modified properly), but I'm not familiar with the fill() function. Are the results supposed to be added to the "Dates" dataframe or the "df" dataframe? — lsch91, Feb 18 '18 at 00:33
@lsch91 You need to assign the result to a new dataframe, or you can even assign it back to `df`. Let me modify the answer to make that clear. — MKR, Feb 18 '18 at 00:36
Gotcha. I was thinking that would be the case. Thanks for your help! — lsch91, Feb 18 '18 at 00:37
@lsch91 Happy to help. You can delete most of this conversation, as it won't help future users. — MKR, Feb 18 '18 at 00:38

score 0 · Answer 2 · answered Feb 17 '18 at 23:25

I think this might do:

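library(tibble)
library(lubridate)

df <- tibble(DateTime = c("2015-01-01 00:00:00", "2015-01-01 01:00:00"),
             Temperature = c(25, 26), Humidity = c(0.67, 0.69))
df$DateTime <- ymd_hms(df$DateTime)
n <- nrow(df)
# 12 five-minute stamps per interval (end point excluded), then the last timestamp;
# sapply() drops the POSIXct class, so convert back from seconds since the epoch
DateTime <- as.POSIXct(c(sapply(1:(n - 1), function(x) seq(df$DateTime[x], by = "5 min", length.out = 12)),
                         as.numeric(df$DateTime[n])), origin = "1970-01-01", tz = "UTC")
# Repeat each value 12 times, then append the last row's value
Temperature <- c(rep(df$Temperature[-n], each = 12), df$Temperature[n])
Humidity <- c(rep(df$Humidity[-n], each = 12), df$Humidity[n])
tibble(as.character(DateTime), Temperature, Humidity)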

<chr>                          <dbl>    <dbl>
 1 2015-01-01 00:00:00             25.0    0.670
 2 2015-01-01 00:05:00             25.0    0.670
 3 2015-01-01 00:10:00             25.0    0.670
 4 2015-01-01 00:15:00             25.0    0.670
 5 2015-01-01 00:20:00             25.0    0.670
 6 2015-01-01 00:25:00             25.0    0.670
 7 2015-01-01 00:30:00             25.0    0.670
 8 2015-01-01 00:35:00             25.0    0.670
 9 2015-01-01 00:40:00             25.0    0.670
10 2015-01-01 00:45:00             25.0    0.670
11 2015-01-01 00:50:00             25.0    0.670
12 2015-01-01 00:55:00             25.0    0.670
13 2015-01-01 01:00:00             26.0    0.690
