I have a data frame "v" with id and value columns, such as:
set.seed(123)
v <- data.frame(id=sample(1:5),value=sample(1:5))
v
id value
1 2 1
2 4 3
3 5 4
4 3 2
5 1 5
In each iteration of a loop, I want to find the row indices of v whose id matches tmp, and then subset v by those indices. tmp is a sample, drawn with replacement, of v$id.
Here is my attempt:
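iter <- 1
df <- vector(mode = 'list', length = iter)  # one resampled data frame per iteration
for (i in seq_len(iter))
{
  # resample the ids with replacement
  tmp <- sample(v$id, replace = TRUE)
  index.position <- NULL
  # collect the row positions of v that match each sampled id
  for (j in seq_along(tmp)) {index.position <- c(index.position, which(v$id %in% tmp[j]))}
  df[[i]] <- v[index.position, ]
}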
tmp
[1] 1 5 3 5 2
df
[[1]]
id value
5 1 5
3 5 4
4 3 2
3.1 5 4
1 2 1
This works as expected. However, execution is very slow when both "v" and "iter" are large, because growing index.position one piece at a time is not memory efficient.
I have also tried creating an empty matrix or list as a placeholder and assigning index.position into it as I loop, but that did not really speed up the process. (Reference: Growing a data.frame in a memory-efficient manner)
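For concreteness, one reading of the placeholder approach can be sketched as below. It hard-codes v and tmp as printed above so the result does not depend on the random number generator, and index.list is a hypothetical name used only for this illustration.

```r
# v and tmp copied from the printed output above (hard-coded, no RNG involved)
v <- data.frame(id = c(2, 4, 5, 3, 1), value = c(1, 3, 4, 2, 5))
tmp <- c(1, 5, 3, 5, 2)

# preallocate one slot per sampled id instead of growing a vector
index.list <- vector(mode = "list", length = length(tmp))
for (j in seq_along(tmp)) {
  # each slot holds every row position of v whose id equals tmp[j]
  index.list[[j]] <- which(v$id %in% tmp[j])
}

# flatten once at the end, then subset
index.position <- unlist(index.list)
res <- v[index.position, ]
res
```

This avoids copying index.position on every pass, but it still calls which() once per sampled id, which may be why it did not help much.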
Edit: id is not unique in v (unlike in the small example above), so a single sampled id can match several rows.
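To illustrate the non-unique case, here is a small sketch using a hypothetical data frame v2 and a hypothetical sample tmp2 (both made up for this illustration, not taken from my real data). It runs the same inner loop as in my attempt.

```r
# hypothetical data: id 1 appears twice
v2 <- data.frame(id = c(1, 1, 2), value = c(10, 20, 30))
# hypothetical sample of ids, with id 1 drawn twice
tmp2 <- c(1, 2, 1)

index.position <- NULL
for (j in seq_along(tmp2)) {
  # which() returns all rows matching tmp2[j], not just the first one
  index.position <- c(index.position, which(v2$id %in% tmp2[j]))
}

index.position
res2 <- v2[index.position, ]
res2
```

Here both rows with id 1 are returned each time 1 is drawn, so the result has five rows rather than three. A lookup that returns only the first match per id, such as match(), would therefore not give the same result.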