You could do this with a list. First, though, here is a version that creates a separate data frame for each window, named after the midpoint of that window.
mydata<-data.frame(dataelement1 = rep(1:20),dataelement2=letters[1:20]) ## create sample data with 20 rows
idx <- seq(from=3,to=(nrow(mydata)-3),by=3) ## midpoints of the 5-row windows, sliding by 3 rows
for (i in idx){
start = i-2; stop = i+2
assign(paste0("newdata",i),mydata[start:stop,])
}
This yields the following 5 dataframes named with the midpoint index used.
> newdata3
dataelement1 dataelement2
1 1 a
2 2 b
3 3 c
4 4 d
5 5 e
> newdata6
dataelement1 dataelement2
4 4 d
5 5 e
6 6 f
7 7 g
8 8 h
> newdata9
dataelement1 dataelement2
7 7 g
8 8 h
9 9 i
10 10 j
11 11 k
> newdata12
dataelement1 dataelement2
10 10 j
11 11 k
12 12 l
13 13 m
14 14 n
> newdata15
dataelement1 dataelement2
13 13 m
14 14 n
15 15 o
16 16 p
17 17 q
List solution:
## Putting results into a single list object instead##
mylist<-list()
j=1
for (i in idx){
start = i-2; stop = i+2
mylist[[j]] <- mydata[start:stop,]
j=j+1
}
## The list elements are indexed sequentially (1 to 5), not by midpoint ##
mylist[[1]]
mylist[[5]]
> mylist[[1]]
dataelement1 dataelement2
1 1 a
2 2 b
3 3 c
4 4 d
5 5 e
> mylist[[5]]
dataelement1 dataelement2
13 13 m
14 14 n
15 15 o
16 16 p
17 17 q
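As a possible alternative to growing the list with a manual counter, the same windows can be built with lapply(). This is only a sketch: the names windows and the "newdata" name prefix on the list elements are my own choices for illustration, picked to mirror the data frame names used above.

```r
# Same sample data and midpoints as above
mydata <- data.frame(dataelement1 = 1:20, dataelement2 = letters[1:20])
idx <- seq(from = 3, to = nrow(mydata) - 3, by = 3)

# One list element per midpoint: rows (i - 2) through (i + 2)
windows <- lapply(idx, function(i) mydata[(i - 2):(i + 2), ])

# Name each element after its midpoint so it can be looked up by name
names(windows) <- paste0("newdata", idx)

windows[["newdata9"]]  # rows 7 through 11
```

Naming the elements keeps the midpoint information that the sequential index in mylist loses, while still giving you a single object to loop over.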
EDIT: As requested.
seq() creates a sequence. Since your first window is centered at row 3, we tell it to start at 3. Since you want the window to move 3 rows each time, we set by = 3. The end is defined by the number of rows in your data, using nrow(). We subtract 3 from that because we don't want a midpoint that fails to have 2 more rows after it in mydata.
I should have used -2, because row 18 would still have 2 rows ahead of it; with -2, idx would also include 18.
As written (with -3), this creates a vector idx which equals c(3,6,9,12,15) that we will use in the loop.
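To make the end-point choice concrete, here is a small sketch comparing the two offsets for 20 rows. The variable names n, idx_minus3 and idx_minus2 are made up for this illustration.

```r
n <- 20  # number of rows in the sample data

# End point used in the code above: the last candidate midpoint is row 17
idx_minus3 <- seq(from = 3, to = n - 3, by = 3)

# End point that would have been sufficient: row 18 still has rows 19 and 20 after it
idx_minus2 <- seq(from = 3, to = n - 2, by = 3)

idx_minus3  # 3 6 9 12 15
idx_minus2  # 3 6 9 12 15 18
```

With -2 the loop would produce one extra window covering rows 16 through 20.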
for (i in idx){
start = i-2; stop = i+2
assign(paste0("newdata",i),mydata[start:stop,])
}
for (i in idx) {
This says to loop over every value contained in idx.
start = i-2; stop = i+2
So for the first value of idx (3), we define start=1 and stop=5, which are the bounds of your window.
The last line creates a data frame whose name has the prefix newdata and a suffix equal to the value of idx we are currently looping over. It then subsets your data set based on the values of start and stop defined on the previous line.
So the first time through the loop, the last line resolves to newdata3 <- mydata[1:5,], which takes the 5 rows of interest (with all columns).
There are various ways of sub-setting in R. This is a good reference.
For example, mydata[1:5,1:2] would subset not only rows 1 through 5, but also columns 1 and 2.
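A few variations on that bracket syntax, as a rough sketch using the same sample data. The variable names on the left are just for illustration.

```r
mydata <- data.frame(dataelement1 = 1:20, dataelement2 = letters[1:20])

rows_only <- mydata[1:5, ]      # rows 1 to 5, all columns
rows_cols <- mydata[1:5, 1:2]   # rows 1 to 5, columns 1 and 2

# Selecting a single column returns a plain vector by default...
as_vector <- mydata[1:5, "dataelement1"]

# ...unless drop = FALSE is given, which keeps the result a data frame
one_col_df <- mydata[1:5, "dataelement1", drop = FALSE]
```

The drop = FALSE form is worth knowing if later code expects every subset to be a data frame.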