I have the following data frame:
df <- structure(list(a = c(1, 43, 22, 12, 35, 113, 54, 94), b = c("a",
"b", "c", "d", "e", "f", "g", "h")), .Names = c("a", "b"), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))
From this data I want to select consecutive subsequences of a certain length. For example, for a sequence length of two, I want to select rows 1-2, 2-3, 3-4, and so on until the last row of the data frame. Each subsequence should then be labelled.
With a subsequence length of 2, the new data frame with its sequence labels would look like this:
  a  b  seq_label
  1  a  1   # First subsequence, row 1-2
 43  b  1   #
 43  b  2   # Second subsequence, row 2-3
 22  c  2   #
 22  c  3   # Third subsequence, row 3-4
 12  d  3   #
 12  d  4
 35  e  4
 35  e  5
113  f  5
113  f  6
 54  g  6
 54  g  7
 94  h  7
Similarly, with a subsequence length of 3:
  a  b  seq_label
  1  a  1   # First subsequence, row 1-3
 43  b  1   #
 22  c  1   #
 43  b  2   # Second subsequence, row 2-4
 22  c  2   #
 12  d  2   #
 22  c  3   # Third subsequence, row 3-5
 12  d  3   #
 35  e  3   #
 12  d  4
 35  e  4
113  f  4
 35  e  5
113  f  5
 54  g  5
113  f  6
 54  g  6
 94  h  6
....
Thanks to @drjones's suggested answer, I have arrived at the following solution, where n is the subsequence length:
library(purrr); n <- 2  # n is the subsequence length
map_dfr(1:(nrow(df) - n + 1), function(i) cbind(df[i:(i + n - 1), ], seq_label = i))
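For comparison, here is a rough base R sketch of the same windowing idea that builds all row indices at once instead of binding one window at a time. The function name label_subsequences is made up for illustration, and the example uses a plain data.frame rather than the tibble above so that it runs without extra packages. I have not benchmarked it against the map_dfr version.

```r
# Example data as a plain data.frame (assumption: tibble classes are not needed here)
df <- data.frame(a = c(1, 43, 22, 12, 35, 113, 54, 94),
                 b = c("a", "b", "c", "d", "e", "f", "g", "h"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: returns every run of n consecutive rows, labelled 1, 2, ...
label_subsequences <- function(df, n) {
  stopifnot(n >= 1, n <= nrow(df))
  starts <- seq_len(nrow(df) - n + 1)                # first row of each window
  # outer() gives an n x length(starts) matrix of row numbers;
  # as.vector() reads it column by column, i.e. window by window
  idx <- as.vector(outer(0:(n - 1), starts, `+`))
  out <- df[idx, , drop = FALSE]
  out$seq_label <- rep(starts, each = n)             # same label for all rows of a window
  rownames(out) <- NULL
  out
}

res2 <- label_subsequences(df, 2)
res3 <- label_subsequences(df, 3)
```

With 8 rows there are 8 - n + 1 windows, so res2 should have 7 labels and res3 should have 6, matching the two tables above.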