I'd like to write a loop that creates five new data sets in R, each one containing a different number of observations from an original data frame, df.

Here is my current code. Each dfi ends up holding a string rather than the actual object ("df[1:44 + i,]" instead of df[1:44 + i,]).

for (i in 1:5) {
  nam <- paste("df", i, sep = "")
  assign(nam, eval(paste("df", "[1:44 + ", i, ",]", sep = "")))
}

I'd like to return the df object when it loops, but I don't know how to do that. Any suggestions? Thank you very much in advance.

Slash
  • If you're going to be doing the same thing with all five sets, I suggest it would be better to use `lapply(...)` and store the 5 separate frames in a single `list`. https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames/24376207#24376207 – r2evans Jan 09 '19 at 21:08
  • Which df do you want to return? You just created 5 of them? And `for` loops in R don't return anything. I agree with the above comment that you should probably be using `lapply` or `map` from `purrr`. You should avoid `assign()` in normal R code. It causes more problems than it solves. – MrFlick Jan 09 '19 at 21:11
  • *"It causes more problems than it solves."* That should be the title of a blog-post with "Not-So-Best Practices in R", subtitled "Common mistakes/misconceptions that look okay on the surface". I've had to troubleshoot others' scripts that used `assign`, `eval`, `get`, and `<<-`, and it almost always ends in more-caffeine, a headache ... and some recommendations. – r2evans Jan 09 '19 at 21:14

2 Answers

Given a sample dataset:

df <- mtcars

And here's the list of frames:

list_of_frames <- lapply(1:5, function(i) df[1:3 + i,])
list_of_frames[[3]]
#                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

If you really like the names, you can also name the list elements:

names(list_of_frames) <- paste0("df", 1:5)
list_of_frames[["df3"]]
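Once the frames sit in a named list, the same operation can be applied to all of them in one call. The following is a minimal sketch of that idea; the summaries chosen here (row counts and mean `mpg`) and the variable names `row_counts`, `same_frame`, and `mean_mpg` are only illustrations, not part of the original question.

```r
df <- mtcars
list_of_frames <- lapply(1:5, function(i) df[1:3 + i, ])
names(list_of_frames) <- paste0("df", 1:5)

# Row count of every frame in one call (a named integer vector)
row_counts <- sapply(list_of_frames, nrow)

# `$` and `[[` both return the data frame itself
same_frame <- identical(list_of_frames$df3, list_of_frames[["df3"]])

# Apply the same summary to each frame: mean mpg per frame
mean_mpg <- sapply(list_of_frames, function(d) mean(d$mpg))
```

Because `sapply` keeps the list names, the results are labelled `df1` through `df5` without any extra bookkeeping.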

If you really need to keep each variable separate, then here's the loop:

ls() # proof that they don't exist yet
# [1] "df"
for (i in 1:5) assign(paste0("df", i), df[1:3 + i,])
ls()
# [1] "df"  "df1" "df2" "df3" "df4" "df5" "i"  
df3
#                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
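
As for why the loop in the question stored text: `eval()` applied to a character string simply returns that string, because a string is not code until it has been parsed. The sketch below shows the difference, using `1:3` and `i <- 3` as stand-in values. It is meant only to explain the behaviour; indexing `df` directly, as above, is simpler and avoids `eval(parse(...))` entirely.

```r
df <- mtcars
i <- 3
code_text <- paste0("df[1:3 + ", i, ",]")

# eval() on a character vector just returns that character vector
as_text <- eval(code_text)

# parse(text = ...) turns the string into an expression that eval() can run
as_frame <- eval(parse(text = code_text))

is.character(as_text)    # TRUE
is.data.frame(as_frame)  # TRUE
```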
r2evans

I read this as drawing several random samples from a data frame, where the same rows may turn up in more than one sample. You can do this with `lapply` and `dplyr` (part of the tidyverse).

floor(runif(5, 10, 30))

This generates 5 random integers from 10 to 29 (`floor` rounds down, so 30 is effectively never returned). Change the bounds as you like.

function(x) mtcars %>% sample_n(x)

This takes a dataframe (mtcars), and samples some number of rows from the dataframe.

library(dplyr)  # provides sample_n() and the %>% pipe
lDF <- lapply(floor(runif(5, 10, 30)), function(x) mtcars %>% sample_n(x))

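This puts it together using `lapply`, which creates a list of data frames. Reference an individual data frame with double brackets, e.g. `lDF[[1]]`; single brackets (`lDF[1]`) return a one-element list rather than the data frame itself.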
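
If you would rather not depend on dplyr, roughly the same thing can be sketched in base R. This is an alternative technique, not the code above: `sample(10:30, ...)` replaces `floor(runif(...))` (and includes 30), row indexing replaces `sample_n`, and the seed value 42 is an arbitrary choice made only so the draw is repeatable.

```r
set.seed(42)  # arbitrary seed, only so the draw is repeatable

# Five sample sizes drawn from 10..30 inclusive
sizes <- sample(10:30, 5, replace = TRUE)

# One data frame per size; within each sample, rows are drawn without replacement
lDF <- lapply(sizes, function(n) mtcars[sample(nrow(mtcars), n), ])

class(lDF[1])    # "list": single brackets keep the list wrapper
class(lDF[[1]])  # "data.frame": double brackets extract the element
```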

Sahir Moosvi