I'm not sure how fast it is, but here's a solution using split and mapply.
Some example data:
set.seed(1)
df <- data.frame(var1 = 1:10,
                 var2 = 11:20,
                 var3 = 21:30,
                 intervals = sample(0:2, 10, replace = TRUE))
var1 var2 var3 intervals
1 1 11 21 0
2 2 12 22 1
3 3 13 23 1
4 4 14 24 2
5 5 15 25 0
6 6 16 26 2
7 7 17 27 2
8 8 18 28 1
9 9 19 29 1
10 10 20 30 0
We first sort the dataframe by intervals:
df <- df[order(df$intervals), ]
var1 var2 var3 intervals
1 1 11 21 0
5 5 15 25 0
10 10 20 30 0
2 2 12 22 1
3 3 13 23 1
8 8 18 28 1
9 9 19 29 1
4 4 14 24 2
6 6 16 26 2
7 7 17 27 2
Now we split the data into subsets, one for every value of intervals.
df1 <- split(df, df$intervals)
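To make it clearer what split hands back, here is a small sketch that inspects the resulting list. I've written the interval values out by hand (copied from the example output above) instead of calling sample, so the snippet does not depend on the random number generator.

```r
# Rebuild the example with the interval values written out explicitly
df <- data.frame(var1 = 1:10,
                 var2 = 11:20,
                 var3 = 21:30,
                 intervals = c(0, 1, 1, 2, 0, 2, 2, 1, 1, 0))
df <- df[order(df$intervals), ]
df1 <- split(df, df$intervals)

names(df1)         # one list element per distinct value of intervals
sapply(df1, nrow)  # number of rows in each subset
df1[["1"]]         # the subset where intervals == 1
```

Each element of df1 is itself a dataframe with the same columns as df, named after the value of intervals it belongs to.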
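Now we use mapply to loop simultaneously over the list of subsets and the vector unique(df$intervals)+1 (for you it would be +3), which gives the column to select from each subset. Because the dataframe is already sorted, unique(df$intervals) comes out in the same order as the subsets returned by split.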
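newvalues <- mapply(function(x, y){
  x[, y]  # take column y of subset x
}, df1, unique(df$intervals)+1,
SIMPLIFY = FALSE)  # always return a list, even when all subsets have the same size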
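If you'd rather not rely on the sort order to line up the subsets with their column numbers, one option is to derive the column numbers from the names of the list itself. This is only a sketch of that idea: cols is a name I made up for illustration, and the interval values are again typed in by hand.

```r
df <- data.frame(var1 = 1:10,
                 var2 = 11:20,
                 var3 = 21:30,
                 intervals = c(0, 1, 1, 2, 0, 2, 2, 1, 1, 0))
df <- df[order(df$intervals), ]
df1 <- split(df, df$intervals)

# Column index for each subset, taken from the list names ("0", "1", "2")
cols <- as.integer(names(df1)) + 1

# Pair the i-th subset with the i-th column index
newvalues <- mapply(function(x, y) x[, y], df1, cols, SIMPLIFY = FALSE)

newvalues[["0"]]  # var1 values of the rows with intervals == 0
newvalues[["1"]]  # var2 values of the rows with intervals == 1
newvalues[["2"]]  # var3 values of the rows with intervals == 2
```

mapply keeps the names of df1, so the pieces can be looked up by interval value.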
Finally, we feed the values back into the sorted dataframe using unlist.
df$new <- unlist(newvalues)
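The reason unlist works here is that it flattens the list group by group, which is the same order as the rows of the sorted dataframe. A tiny sketch, with the list typed in by hand to match the example (flat is just an illustrative name):

```r
# The per-group values from the example, written out explicitly
newvalues <- list(`0` = c(1L, 5L, 10L),
                  `1` = c(12L, 13L, 18L, 19L),
                  `2` = c(24L, 26L, 27L))

# Flatten group by group; use.names = FALSE drops the generated element names
flat <- unlist(newvalues, use.names = FALSE)
flat
```

Dropping the names is optional; it just keeps the new column free of leftover labels.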
Result:
var1 var2 var3 intervals new
1 1 11 21 0 1
5 5 15 25 0 5
10 10 20 30 0 10
2 2 12 22 1 12
3 3 13 23 1 13
8 8 18 28 1 18
9 9 19 29 1 19
4 4 14 24 2 24
6 6 16 26 2 26
7 7 17 27 2 27
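Since I wasn't sure about speed: a different technique that avoids sorting and splitting altogether is matrix indexing, where you index a matrix with a two-column matrix of (row, column) pairs. This is a sketch under the assumption that the value columns are all numeric and sit in the order var1, var2, var3; value_cols and m are names I picked for illustration.

```r
df <- data.frame(var1 = 1:10,
                 var2 = 11:20,
                 var3 = 21:30,
                 intervals = c(0, 1, 1, 2, 0, 2, 2, 1, 1, 0))

# Columns to pick from, in the order that intervals + 1 refers to
value_cols <- c("var1", "var2", "var3")
m <- as.matrix(df[value_cols])

# One (row, column) pair per row of df
df$new <- m[cbind(seq_len(nrow(df)), df$intervals + 1)]
df
```

This keeps the rows in their original order, so there is nothing to feed back afterwards.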