Is there a way to repeat a function a fixed number of times and save every result as a data frame?

Question

Let's say I have a data frame that looks something like this:

A <- c(1:100)
B <- c(0.5:100)
df <- data.frame(A,B)

I want to draw 25 random rows from this data frame with:

df[sample(nrow(df), size = 25, replace = FALSE),]

Now I want to repeat this sampling step 100 times and save every result individually. I've tried using repeat, but I can't find a way to save each result.

Thank you.

You can also look into using a for loop. I agree with @camille that replicate would be the easiest. — Hansel Palencia, Dec 02 '19 at 20:13
Related posts: https://stackoverflow.com/q/46104176/5325862, https://stackoverflow.com/q/13313432/5325862, https://stackoverflow.com/q/39462200/5325862 — camille, Dec 02 '19 at 20:16
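
The for loop mentioned in the comments could look roughly like the sketch below. It is only an illustration: the names `n_reps` and `res` are made up here, and the list is preallocated so that each iteration has a slot to store its sample in.

```r
# Example data, as in the question
df <- data.frame(A = 1:100, B = 0.5:100)

n_reps <- 100                    # hypothetical name: number of repetitions
res <- vector("list", n_reps)    # preallocate one list slot per sample

for (i in seq_len(n_reps)) {
  # store the i-th sample of 25 rows as its own data frame
  res[[i]] <- df[sample(nrow(df), size = 25, replace = FALSE), ]
}
```

Each sample is then available as `res[[1]]`, `res[[2]]`, and so on.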

score 0 · Accepted Answer · answered Dec 02 '19 at 20:40

As mentioned in the comments, replicate() does what you need. With simplify = FALSE, the result is a list of 100 data frames:

res <- replicate(100, df[sample(nrow(df), size = 25, replace = FALSE), ], simplify = FALSE)

An alternative is sapply() with simplify = FALSE:

res <- sapply(1:100, function(k) df[sample(nrow(df), size = 25, replace = FALSE), ], simplify = FALSE)

or lapply(), which always returns a list:

res <- lapply(1:100, function(k) df[sample(nrow(df), size = 25, replace = FALSE), ])
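
Once `res` is a list, each sample can be pulled out by index or summarised in one pass. The following is a minimal sketch; the `set.seed()` call and the name `sums_A` are assumptions added for illustration, and the seed is only there to make the draws repeatable.

```r
set.seed(1)  # assumption: fixed seed so the samples are reproducible

# Example data, as in the question
df <- data.frame(A = 1:100, B = 0.5:100)

res <- replicate(100,
                 df[sample(nrow(df), size = 25, replace = FALSE), ],
                 simplify = FALSE)

length(res)     # number of stored samples
head(res[[1]])  # first rows of the first sample

# one value per sample: the sum of column A
sums_A <- sapply(res, function(s) sum(s$A))
```

`sums_A` is a vector with one entry per sample, which may also cover the follow-up question in the comments about summing column A.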

score 0 · Answer 2 · answered Dec 02 '19 at 20:50

replicate() is a great option for this problem.

If you would like your final results in a single table with a column for the ID variable, you can use bind_rows() from the dplyr package. Here is a smaller example (3 samples from a data set of 10 rows) that may allow easier understanding of replicate()'s behavior:

library(dplyr, warn.conflicts = FALSE)

# make a smaller data set of 10 rows
d <- data.frame(
  A = 1:10,
  B = LETTERS[1:10]
) %>% print
#>     A B
#> 1   1 A
#> 2   2 B
#> 3   3 C
#> 4   4 D
#> 5   5 E
#> 6   6 F
#> 7   7 G
#> 8   8 H
#> 9   9 I
#> 10 10 J

# create 3 samples, with each sample containing 4 rows
reps <- replicate(3, d[sample(nrow(d), 4, FALSE), ], simplify = FALSE) %>% print
#> [[1]]
#>   A B
#> 2 2 B
#> 5 5 E
#> 6 6 F
#> 1 1 A
#> 
#> [[2]]
#>   A B
#> 3 3 C
#> 2 2 B
#> 5 5 E
#> 8 8 H
#> 
#> [[3]]
#>   A B
#> 4 4 D
#> 9 9 I
#> 3 3 C
#> 8 8 H

# bind the list elements into a single data frame, with an ID column for the sample
bind_rows(reps, .id = "sample_id")
#>    sample_id A B
#> 1          1 2 B
#> 2          1 5 E
#> 3          1 6 F
#> 4          1 1 A
#> 5          2 3 C
#> 6          2 2 B
#> 7          2 5 E
#> 8          2 8 H
#> 9          3 4 D
#> 10         3 9 I
#> 11         3 3 C
#> 12         3 8 H

Created on 2019-12-02 by the reprex package (v0.3.0)
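
If you would rather not depend on dplyr, roughly the same combined table can be built in base R. This is a sketch, not part of the reprex above: `tagged` and `combined` are illustrative names, the seed is an assumption for repeatability, and here `sample_id` is an integer rather than the character column that `bind_rows()` produces.

```r
set.seed(42)  # assumption: fixed seed so the samples are reproducible

d <- data.frame(A = 1:10, B = LETTERS[1:10])
reps <- replicate(3, d[sample(nrow(d), 4, FALSE), ], simplify = FALSE)

# add the sample index as a column to each data frame in the list
tagged <- Map(function(s, i) cbind(sample_id = i, s), reps, seq_along(reps))

# stack the tagged data frames and reset the row names
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL

# sum of column A within each sample
aggregate(A ~ sample_id, data = combined, FUN = sum)
```

The `aggregate()` call returns one row per sample with the total of column A.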

Thank you. I am using `myfun <- function(){ df[sample(nrow(df), size = 25, replace = FALSE),] }` for my function right now. Is there a way to get the sum of column A in my function for every sample? — C. Toni, Dec 02 '19 at 21:12
Please don't make the same comment after multiple answers. It makes it harder to track responses. — davechilders, Dec 02 '19 at 21:34
