This is my first attempt at working with lists. I am trying to clean up my code by moving parts of it into functions. One idea is to split a big data frame into multiple subsets with a function, so that I can retrieve a subset by calling the function when needed.

I will use the mtcars data frame to explain what I am trying to do:

  1. add an id to mtcars
  2. create a function with one argument (mtcars) that returns a list of subset data frames (mtcars1, mtcars2, mtcars3). I learned this from Federico Giorgi's answer to "How to assign from a function which returns more than one value?"

I manage to create the list, but I do not know how to make the 3 subset data frames (mtcars1, mtcars2, mtcars3) appear as objects in the global environment. How can I get these 3 data frame objects out of the list returned by my function? Thanks!

My Code:

library(dplyr)
# add id to mtcars
mtcars <- mtcars %>% 
    mutate(id = row_number())

# create function to subset in 3 dataframes

my_func_cars <- function(input){
    # use the argument `input` rather than the global mtcars,
    # so the function works on whatever data frame is passed in

    # first subset
    mtcars1 <- input %>% 
        select(id, mpg, cyl, disp)
    
    # second subset
    mtcars2 <- input %>% 
        select(id, hp, drat, wt, qsec)
    
    # third subset
    mtcars3 <- input %>% 
        select(id, vs, am, gear, carb)
    
    output <- list(mtcars1, mtcars2, mtcars3)
    return(output)
}


output<-my_func_cars(mtcars)

for (i in output) {
    print(i)
}
TarJae
  • You do not need to run a `for()` loop to see your data frames. You can just run: `output` and they will all show up in the console. If you want to see a specific data frame, you can subset a list with `[[`: `output[[1]]`, `output[[2]]` and `output[[3]]`. – SavedByJESUS Jan 01 '21 at 18:09
  • Very helpful advice. Thanks. – TarJae Jan 01 '21 at 18:22
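
As a minimal sketch of the subsetting described in the comment above, using base R only. The list `parts` and its element names are made up for illustration and are not part of the original code:

```r
# A small list of data frames, built from the built-in mtcars data
parts <- list(
  first  = mtcars[, c("mpg", "cyl", "disp")],
  second = mtcars[, c("hp", "drat", "wt", "qsec")]
)

# `[[` extracts one element, here a data frame
class(parts[[1]])   # "data.frame"

# `[` keeps the list wrapper and returns a list of length one
class(parts[1])     # "list"

# Named elements can also be extracted by name
identical(parts[["second"]], parts$second)   # TRUE
```

The difference between `[[` and `[` matters when the result is passed on to functions that expect a data frame rather than a list.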

1 Answer

It may be better to output a named list:

library(dplyr)
library(stringr)


my_func_cars <- function(input){

    # capture the name of the object passed in as a string, e.g. "mtcars"
    nm1 <- deparse(substitute(input))
    # first subset
    obj1 <- input %>% 
        select(id, mpg, cyl, disp)
    
    # second subset
    obj2 <- input  %>% 
        select(id, hp, drat, wt, qsec)
    
    # third subset
    obj3 <- input %>% 
        select(id, vs, am, gear, carb)
    
    dplyr::lst(!! str_c(nm1, 1) := obj1,
               !! str_c(nm1, 2) := obj2,
               !! str_c(nm1, 3) := obj3)
    
}

Then we use `list2env` to create the objects in the global environment:

mtcars <- mtcars %>% 
    mutate(id = row_number())
list2env(my_func_cars(mtcars), .GlobalEnv)

Check the objects:

head(mtcars1, 2)
#  id mpg cyl disp
#1  1  21   6  160
#2  2  21   6  160

head(mtcars2, 2)
#  id  hp drat    wt  qsec
#1  1 110  3.9 2.620 16.46
#2  2 110  3.9 2.875 17.02


head(mtcars3,  2)
#  id vs am gear carb
#1  1  0  1    4    4
#2  2  0  1    4    4
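
As a possible alternative, here is a sketch that avoids writing into the global environment. It uses base R only; `subsets` is a hypothetical stand-in for the named list returned by `my_func_cars()`, and `cars_env` is a name chosen for illustration:

```r
# Hypothetical stand-in for the named list returned by my_func_cars()
subsets <- list(
  mtcars1 = mtcars[, c("mpg", "cyl", "disp")],
  mtcars2 = mtcars[, c("hp", "drat", "wt", "qsec")]
)

# Option 1: leave the data frames in the list and access them by name
head(subsets$mtcars1, 2)

# Option 2: copy them into a dedicated environment instead of .GlobalEnv
cars_env <- new.env()
list2env(subsets, envir = cars_env)

ls(cars_env)                                   # "mtcars1" "mtcars2"
identical(cars_env$mtcars2, subsets$mtcars2)   # TRUE
```

Note that `list2env` needs a named list, which is why the function above names its elements. Keeping the data frames in the list, or in their own environment, avoids silently overwriting objects of the same name in the workspace.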
akrun
  • Thanks akrun. Your very first solution was the clearest for me. Can you post it again? I think `mtcars <- mtcars %>% mutate(id = row_number())` is missing in your newest edit. Could you also explain `nm1 <- deparse(substitute(input))` and `lst(!! str_c(nm1, 1) := obj1, ..`? Thank you in advance! – TarJae Jan 01 '21 at 18:21
  • @TarJae Thanks, I updated the input data passed into the function with the added `id`. Regarding `deparse(substitute(input))`: the object passed is a data.frame `mtcars`, so `deparse(substitute(input))` gives us the object name as a string, `"mtcars"`, which we use to create the names of the list elements. The `lst` function from `dplyr`/`tibble` lets us assign (`:=`) names on the lhs based on the value of an object. I changed the hardcoded `mtcars1`, `mtcars2` to a more automatic way in the function. It may also be better to have more arguments with column names – akrun Jan 01 '21 at 18:25
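
The two ideas in the last comment, building names from the argument name and passing column names as arguments, can be sketched together in base R. The helper `split_columns` and its `groups` parameter are hypothetical and not part of the answer; `setNames()` and `paste0()` stand in here for `lst()` with `:=`:

```r
# Hypothetical helper: split a data frame into named subsets.
# `groups` is a list of character vectors of column names (an assumption,
# following the suggestion to pass column names as arguments).
split_columns <- function(input, groups) {
  # substitute() captures the unevaluated argument expression,
  # deparse() turns it into a string such as "mtcars"
  nm <- deparse(substitute(input))

  # one data frame per column group; drop = FALSE keeps
  # single-column selections as data frames
  out <- lapply(groups, function(cols) input[, cols, drop = FALSE])

  # build names like "mtcars1", "mtcars2", ...
  setNames(out, paste0(nm, seq_along(out)))
}

res <- split_columns(mtcars, list(c("mpg", "cyl"), c("hp", "wt")))
names(res)           # "mtcars1" "mtcars2"
names(res$mtcars2)   # "hp" "wt"
```

One caveat: `deparse(substitute(input))` returns the whole expression that was passed, so calling the helper with something other than a plain object name, such as `head(mtcars)`, would produce names that are not convenient to use as object names.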