1

I have a csv data file that has hour level data. I have to copy this one week of CSV data to a hourly CSV file for one year. Putting simply, copy 1 week worth of hourly data into 52 weeks (yearly CSV file). How can I do this in R?

For example, I have a weekly data starting from 6/1/20 (Monday) to the end of that week on 6/7/20 (Sunday).

SN  TimeStamp      OCCUPANCY
1   6/1/20 01:00    2
2   6/1/20 02:00    2
3   6/1/20 03:00    2
4   6/1/20 04:00    0
5   6/1/20 05:00    0
.
.
.
UPTO ONE WEEK (6/7/20)

How can I copy above weekly data for a whole year starting on January 1, 2020?

SN  TimeStamp      OCCUPANCY
1   1/1/20 01:00    0
2   1/1/20 02:00    0
3   1/1/20 03:00    2
4   1/1/20 04:00    0
5   1/1/20 05:00    5
.
.
.
.

Data from Monday of the weekly table must match with all the Mondays in a year. And so is true for all other days.

Thanks.

Dave2e
  • 22,192
  • 18
  • 42
  • 50

1 Answers1

0

You can expand your dataset by added/subtracting full weeks, and then filter for 2020 data.

library(tidyverse)

df1 <- data.frame(
  "TimeStamp" = c("6/1/20 01:00", "6/1/20 02:00", "6/1/20 03:00", "6/1/20 04:00", "6/1/20 05:00"),
  "OCCUPANCY" = c(2, 2, 2, 0, 0)) %>%
  dplyr::mutate(TimeStamp = strptime(TimeStamp, format = "%m/%d/%y %H:%M", tz = "UTC"))

dfyear <- purrr::map_dfr(-25:35, function(w) {
  df1 %>% dplyr::mutate(TimeStamp = TimeStamp + lubridate::weeks(w))
}) %>%
  dplyr::filter(
    TimeStamp > "2019-12-31",
    TimeStamp < "2021-01-01"
  )

Since the data here only covers 5 hours per week, the first and last days in this minimal example are not Jan 1st and Dec 31st. However, if you start with a full week of data, the results will cover the whole year.

head(dfyear)
#>             TimeStamp OCCUPANCY
#> 1 2020-01-06 01:00:00         2
#> 2 2020-01-06 02:00:00         2
#> 3 2020-01-06 03:00:00         2
#> 4 2020-01-06 04:00:00         0
#> 5 2020-01-06 05:00:00         0
#> 6 2020-01-13 01:00:00         2


tail(dfyear)
#>               TimeStamp OCCUPANCY
#> 255 2020-12-21 05:00:00         0
#> 256 2020-12-28 01:00:00         2
#> 257 2020-12-28 02:00:00         2
#> 258 2020-12-28 03:00:00         2
#> 259 2020-12-28 04:00:00         0
#> 260 2020-12-28 05:00:00         0

The final dataframe can the be exported to csv, e.g. with readr::write_csv(dfyear, file = "yourtarget.csv").

Created on 2022-03-05 by the reprex package (v2.0.1)

pholzm
  • 1,719
  • 4
  • 11