I have a data frame with columns for the combined date and time, the date, and the time, plus the observations of CH4 and the calculated Ratio (there are more columns, but they are irrelevant to this question).
'data.frame': 1420847 obs. of 17 variables
$ Start : Factor w/ 1469 levels "2013-08-31 23:56:09.000",..: 2 2 2 2 2 2 2 2 2 2 ...
$ CO2 : int 1510 1950 1190 1170 780 870 730 740 680 700 ...
$ CH4 : int 66 77 62 58 34 51 36 43 32 40 ...
$ Ratio : num 0.0437 0.0395 0.0521 0.0496 0.0436 ...
$ Start_time: POSIXlt, format: "2013-11-20 00:10:05" "2013-11-20 00:10:05" "2013-11-20 00:10:05" "2013-11-20 00:10:05" ...
$ Start_date: Date, format: "2013-09-01" "2013-09-01" "2013-09-01" "2013-09-01" ...
Now I wish to split every day into six blocks of 4 hours and assign the numbers 1 to 6 to the blocks. The problem, however, is that I only have the date and time at which the measurements started (Start_date and Start_time, or the combined Start), so I think it is necessary to assign each new Start_time to a block. The number of observations per start time varies a lot, so assigning block numbers based on a fixed number of rows is not an option. This is what I wish to accomplish:
Start Start_time Start_date CO2 CH4 Ratio block
2013-09-01 00:10:05.000 00:10:05 2013-09-01 1510 66 0.04370861 1
2013-09-01 00:10:05.000 00:10:05 2013-09-01 1950 77 0.03948718 1
2013-09-01 05:16:55.000 05:16:55 2013-09-01 1190 62 0.05210084 2
2013-09-01 05:16:55.000 05:16:55 2013-09-01 1170 58 0.04957265 2
2013-09-01 05:16:55.000 05:16:55 2013-09-01 780 34 0.04358974 2
2013-09-01 12:44:33.000 12:44:33 2013-09-01 870 51 0.05862069 4
2013-09-01 12:44:33.000 12:44:33 2013-09-01 730 36 0.04931507 4
2013-09-01 22:14:23.000 22:14:23 2013-09-01 740 43 0.05810811 6
2013-09-01 22:14:23.000 22:14:23 2013-09-01 680 32 0.04705882 6
2013-09-02 08:37:05.000 08:37:05 2013-09-02 700 40 0.05714286 3
2013-09-02 08:37:05.000 08:37:05 2013-09-02 610 35 0.05737705 3
2013-09-02 17:22:33.000 17:22:33 2013-09-02 630 25 0.03968254 5
2013-09-02 17:22:33.000 17:22:33 2013-09-02 670 40 0.05970149 5
2013-09-02 23:59:44.000 23:59:44 2013-09-02 640 37 0.05781250 6
2013-09-02 23:59:44.000 23:59:44 2013-09-02 730 35 0.04794521 6
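To make the intended mapping explicit, here is a minimal sketch of how the block column in the table above could be derived. It is only an illustration under assumptions: the small data frame is a hypothetical sample with one row per start time from the table, the time zone is assumed to be UTC, and the block is taken to be the hour of the day integer-divided by 4, plus 1.

```r
# Hypothetical sample: one row per distinct start time from the table above.
df <- data.frame(
  Start = c("2013-09-01 00:10:05.000", "2013-09-01 05:16:55.000",
            "2013-09-01 12:44:33.000", "2013-09-01 22:14:23.000",
            "2013-09-02 08:37:05.000", "2013-09-02 17:22:33.000",
            "2013-09-02 23:59:44.000"),
  CH4 = c(66, 62, 51, 43, 40, 25, 37),
  stringsAsFactors = TRUE  # Start is a factor in the real data
)

# Convert the factor to character and parse it as a date-time.
# %OS reads seconds including the fractional part; tz = "UTC" is an assumption.
start <- as.POSIXlt(as.character(df$Start),
                    format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# Hours 0-3 -> block 1, 4-7 -> block 2, ..., 20-23 -> block 6.
df$block <- start$hour %/% 4 + 1

# Optionally split into one data frame per (date, block) combination,
# dropping combinations that never occur.
groups <- split(df, list(as.Date(start), df$block), drop = TRUE)
```

Because the block depends only on the hour of the start time, the number of rows per start time does not matter: every row that shares a start time gets the same block.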
I have searched this website and Google but, so far, I have found no answer. I tried the following code, which I found in an answer on this website, but had no luck.
qaa <- split(df, cut(strptime(paste(df$Start_date, df$Start_time), format = "%Y-%m-%d %H:%M"),"4 hours"))
Previously, I tried to split the observations by a number of minutes, so I also tried to adjust that code. To be very honest, I have no idea what I am doing (as you can probably tell).
lst <- split(df, df$Start_date)
nobs <- "4 hours"
List <- unlist(lapply(lst, function(x) {
  x$grp <- rep(1:(nrow(x)/nobs+1), each = nobs)[1:nrow(x)]
  split(x, x$grp)
}), recursive = FALSE)
b <- as.matrix(do.call("rbind", List))
Just to let you know, again, I am a noob when it comes to R, so it takes me a lot of time to figure everything out. I understand very little of the language, but I am trying my very best to make it work. I really enjoy working with it! If there is already a question like this on this website, please let me know so I can remove this one. I have not found it, though.
Thank you for taking the time to read my question and for considering answering it!