Is there an easy way to simplify this code using a loop?

Question

Is there a way to simplify this code using a loop?

set.seed(100) 

AL_INDEX <- sample(1:nrow(AL_DF), 0.7*nrow(AL_DF))
AL_TRAIN <- AL_DF[AL_INDEX,]
AL_TEST <- AL_DF[-AL_INDEX,]  

AR_INDEX <- sample(1:nrow(AR_DF), 0.7*nrow(AR_DF))
AR_TRAIN <- AR_DF[AR_INDEX,]
AR_TEST <- AR_DF[-AR_INDEX,]  

AZ_INDEX <- sample(1:nrow(AZ_DF), 0.7*nrow(AZ_DF))
AZ_TRAIN <- AZ_DF[AZ_INDEX,]
AZ_TEST <- AZ_DF[-AZ_INDEX,]

AL_DF, AR_DF, and AZ_DF are data frames with the same field structure but different numbers of records.

Do you want all of those as separate data frames, or are you OK keeping them in a list? — Ronak Shah, May 28 '20 at 01:32
I would prefer them to be in their own data frames because they will be called on later in the model. — Andrew Hicks, May 28 '20 at 01:35

score 2 · Accepted Answer · answered May 28 '20 at 01:48

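Find a pattern that captures all the data frame names. In the example shared, all of them end with "_DF", so ls(pattern = '_DF$') matches them and mget collects them into a named list. Split each data frame into train and test sets, then unlist the result one level so that every train and test set becomes its own list element.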

data <- unlist(lapply(mget(ls(pattern = '_DF$')), function(df) {
  index <- sample(1:nrow(df), 0.7*nrow(df))
  list(train = df[index,], test = df[-index,])
}), recursive = FALSE)

Now get them into individual data frames using list2env. The objects are named after the list elements, for example AL_DF.train and AL_DF.test.

list2env(data, .GlobalEnv)
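
The approach above yields objects named like AL_DF.train rather than the AL_TRAIN / AL_TEST names used in the question. If the original names matter for later code, a variation along these lines might work. It is only a sketch: the toy data frames, the helper split_one, and its frac argument are made up for illustration and are not part of the original answer.

```r
# Toy stand-ins for the question's data frames (hypothetical contents).
set.seed(100)
AL_DF <- data.frame(x = 1:10, y = rnorm(10))
AR_DF <- data.frame(x = 1:20, y = rnorm(20))

# Hypothetical helper: split one data frame into train and test rows.
split_one <- function(df, frac = 0.7) {
  index <- sample(seq_len(nrow(df)), floor(frac * nrow(df)))
  list(TRAIN = df[index, ], TEST = df[-index, ])
}

env <- environment()  # the environment that holds the *_DF objects

# Collect every object whose name ends in "_DF" into a named list,
# then drop the "_DF" suffix so the list names are AL, AR, ...
frames <- mget(ls(envir = env, pattern = "_DF$"), envir = env)
names(frames) <- sub("_DF$", "", names(frames))

# Flatten one level: the names become AL.TRAIN, AL.TEST, AR.TRAIN, ...
splits <- unlist(lapply(frames, split_one), recursive = FALSE)

# Swap the "." separator for "_" to get AL_TRAIN, AL_TEST, ...
names(splits) <- sub(".", "_", names(splits), fixed = TRUE)

# Create one object per list element in the same environment.
list2env(splits, envir = env)
```

After this runs, AL_TRAIN, AL_TEST, AR_TRAIN and AR_TEST exist as separate data frames. Using seq_len(nrow(df)) instead of 1:nrow(df) avoids the 1:0 surprise on an empty data frame, and floor() makes the rounding of the 70% sample size explicit.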
